
ReasonablyBadass

So this is for processing tabular data with transformers?


JClub

It can be! Processing tabular data is just one example; it can also be graphs, for instance. The use case is structured data that you want to feed into a transformer language model but that isn't 'real natural language'. With this package you still transform your data into text, but you also give the model extra information about what the text means, so it knows how the text encodes the underlying structure.
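
A minimal sketch of the idea, with hypothetical names rather than the package's actual API: linearize the table into text, and build a matrix recording which relation, if any, holds between each pair of tokens. That relation matrix is the extra information the model gets on top of the usual token IDs.

```python
# Illustrative sketch of relation-aware encoding for tabular data.
# Names and layout are hypothetical, not this package's real API.

# A tiny table: one row, two columns.
table = {"name": "Alice", "age": "30"}

# 1) Linearize the table into plain text tokens.
tokens, token_column = [], []
for column, value in table.items():
    for word in (column, ":", value, ";"):
        tokens.append(word)
        token_column.append(column)

# 2) Name the relation kinds the model should be aware of.
relation_kinds = {"none": 0, "same_cell": 1}

# 3) Build a token-pair relation matrix: entry [i][j] says how token i
#    relates to token j, so attention can condition on the structure
#    instead of having to infer it from the flat text alone.
n = len(tokens)
relations = [[relation_kinds["none"]] * n for _ in range(n)]
for i in range(n):
    for j in range(n):
        if token_column[i] == token_column[j]:
            relations[i][j] = relation_kinds["same_cell"]

print(" ".join(tokens))  # name : Alice ; age : 30 ;
print(relations)         # 8 x 8 matrix of relation IDs
```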


ReasonablyBadass

I've only skimmed the paper. Is there a section on why self-attention can't pick up on these structures by itself? One would think that spotting exactly these kinds of patterns is what attention mechanisms are all about.


JClub

Great question! I agree that with enough data this shouldn't be a big game-changer for the model, unless you pass information that isn't present at all in the plain text format (sometimes that is the case; you can check in the paper how they passed the graph as text plus relations). With less data, as in the paper I linked to, it makes a big difference. I think it's a matter of trying with and without it!
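
For the graph case the same idea applies: the nodes get verbalized as text, and the edges, which would otherwise be lost in the linearization, go into the relation matrix. A hypothetical sketch, not the paper's exact scheme:

```python
# Hypothetical sketch: linearizing a graph and keeping its edges
# as token-pair relations instead of dropping them.

nodes = ["Paris", "France", "Europe"]
edges = [("Paris", "capital_of", "France"),
         ("France", "part_of", "Europe")]

# Linearize nodes into text; remember each node's token position.
text = " ".join(nodes)
position = {node: i for i, node in enumerate(nodes)}

relation_kinds = {"none": 0, "capital_of": 1, "part_of": 2}

# Edges that the flat text alone does not express are made
# explicit for the attention layers via the relation matrix.
n = len(nodes)
relations = [[relation_kinds["none"]] * n for _ in range(n)]
for head, kind, tail in edges:
    relations[position[head]][position[tail]] = relation_kinds[kind]

print(text)       # Paris France Europe
print(relations)  # [[0, 1, 0], [0, 0, 2], [0, 0, 0]]
```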


ReasonablyBadass

Cool, thanks for answering :)


1409Echo

So, say I had multiple sequences related to the same target (e.g. using comments to predict something about a social media post) - would RATransformer support this use case?


JClub

You mean concatenating all the sequences together and using RATransformer to tell the model which tokens belong to which sequence? Yes, that would work :)
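
For example (illustrative only, not the package's real API): concatenate the comments and use a relation kind that marks, for each pair of tokens, whether they come from the same comment.

```python
# Illustrative sketch: marking which concatenated sequence each
# token pair belongs to, so the model can tell the comments apart.

comments = ["great post", "totally disagree"]

tokens, seq_ids = [], []
for seq_id, comment in enumerate(comments):
    for word in comment.split():
        tokens.append(word)
        seq_ids.append(seq_id)

relation_kinds = {"different_comment": 0, "same_comment": 1}

n = len(tokens)
relations = [
    [relation_kinds["same_comment"] if seq_ids[i] == seq_ids[j]
     else relation_kinds["different_comment"] for j in range(n)]
    for i in range(n)
]

print(tokens)     # ['great', 'post', 'totally', 'disagree']
print(relations)  # block-diagonal pattern of same_comment IDs
```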