FLAN-T5-XL

by Google

The FLAN-T5-XL model, available on Hugging Face, is a large language model capable of various language generation tasks. Because our experiments with the FLAN-T5-XXL model left a good impression, we wanted to evaluate the other available versions as well.

Main use cases: Language generation model that can be used for translation, text summarization, sentiment analysis, or intent recognition. The quality of its language generation lags behind larger, more modern models, whereas its intent recognition, for example, is on a similar level.

Input length: 512 tokens (approx. 384 words) by default; trained on up to 2048 tokens (approx. 1536 words)

Languages: English, French, Romanian, German

Model size: ~3 billion parameters
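
As an illustration of the intent recognition use case, the following is a minimal sketch and not the setup used in our tests. The prompt wording, the helper names (`build_prompt`, `parse_intent`, `detect_intent`, `load_flan_t5_generate`) and the example intent labels are assumptions made for this example. The model identifier `google/flan-t5-xl` is the Hugging Face checkpoint name, and loading it requires the `transformers` and `torch` packages.

```python
from typing import Callable, Sequence

MAX_INPUT_TOKENS = 512  # default input length listed above


def build_prompt(email: str, intents: Sequence[str]) -> str:
    """Build a classification prompt (the wording is an assumption)."""
    options = ", ".join(intents)
    return (
        "Classify the intent of the following customer email. "
        f"Answer with exactly one of: {options}.\n\n"
        f"Email: {email}\n\nIntent:"
    )


def parse_intent(raw_output: str, intents: Sequence[str],
                 fallback: str = "unknown") -> str:
    """Map the generated text back to one of the known intent labels."""
    cleaned = raw_output.strip().lower()
    for intent in intents:
        if intent.lower() in cleaned:
            return intent
    return fallback


def detect_intent(email: str, intents: Sequence[str],
                  generate: Callable[[str], str]) -> str:
    """Run one email through any text-generation callable."""
    return parse_intent(generate(build_prompt(email, intents)), intents)


def load_flan_t5_generate(model_name: str = "google/flan-t5-xl") -> Callable[[str], str]:
    """Return a generate(prompt) callable backed by FLAN-T5 (downloads the model)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    def generate(prompt: str) -> str:
        # Truncate to the default input length of 512 tokens.
        inputs = tokenizer(prompt, return_tensors="pt",
                           truncation=True, max_length=MAX_INPUT_TOKENS)
        output_ids = model.generate(**inputs, max_new_tokens=10)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)

    return generate
```

A call would then look like `detect_intent(email_text, ["cancellation", "invoice question"], load_flan_t5_generate())`, where the intent labels are placeholders for whatever categories an application defines.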

Test results

Use case: intent detection

Quality

Response time

In our intent detection tests, response times were very short, averaging about 0.1 seconds per email. The model is therefore also suitable for real-time applications.

Median: 0.10 sec.
Mean: 0.12 sec.
Minimum: N/A
Maximum: N/A
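
Statistics like these can be collected with a few lines of code. The following is a minimal sketch, assuming a hypothetical `classify` callable that processes one email; it is not the measurement script used for the figures above.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, List


def measure_latencies(classify: Callable[[str], object],
                      emails: Iterable[str]) -> List[float]:
    """Time each classify(email) call and return the durations in seconds."""
    latencies = []
    for email in emails:
        start = time.perf_counter()
        classify(email)
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Compute median, mean, minimum and maximum of the measured durations."""
    return {
        "median": statistics.median(latencies),
        "mean": statistics.mean(latencies),
        "min": min(latencies),
        "max": max(latencies),
    }
```

Reporting the median alongside the mean is useful because a few slow requests pull the mean upwards while leaving the median largely unchanged.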

Costs

This model was run locally on our servers, so no direct costs were incurred. In practice, the price depends heavily on the setup and the hardware used. In general, larger models are more expensive to run than smaller ones; with ~3 billion parameters, FLAN-T5-XL can be considered large.

Hosting

Local hosting possible; a GPU is required
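
To get a rough idea of the GPU memory involved, the size of the model weights can be estimated from the parameter count. The following is a back-of-the-envelope sketch, assuming ~3 billion parameters and the usual byte widths per data type; it covers the weights only and ignores activations and framework overhead. The function name `weight_memory_gb` is chosen for this example.

```python
# Bytes needed to store one parameter in common numeric formats.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "float16") -> float:
    """Estimate the memory for the model weights alone, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("float32", "float16", "int8"):
    print(f"{dtype}: ~{weight_memory_gb(3e9, dtype):.0f} GB")
```

Under these assumptions, the weights take roughly 12 GB in float32 and 6 GB in float16, which is why half precision is a common choice when GPU memory is limited.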

Recommendation

Given the good quality, the very short response times, and the option of hosting the model ourselves, we clearly recommend this model for intent detection in German-language customer emails. This applies especially when response time matters.