Large language model topic modelling

Extract topics and summarise outputs using Large Language Models (LLMs): Gemma 3 4b or GPT-OSS 20b if running locally (see tools/config.py to modify), or Gemini, Azure, or AWS Bedrock models (e.g. Claude, Nova). The app queries the LLM with batches of responses to produce summary tables, which are then compared iteratively to output a table with the general topics, subtopics, topic sentiment, and a topic summary. Instructions on use can be found in the README.md file. To try the app, click one of the example datasets under 'Test with an example dataset' at the bottom of the page; this shows example outputs from a local model run. API keys for AWS, Azure, and Gemini services can be entered on the settings page (note that Gemini has a free public API).
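
The batch-then-combine flow described above can be illustrated with a minimal sketch. This is not the app's actual code: the function names (`batch_responses`, `merge_topic_tables`, `extract_topics`) and the `summarise_batch` callable, which stands in for the LLM call, are all hypothetical. The merge step here simply counts repeated rows, a simplification swapped in for the app's iterative, LLM-driven comparison of summary tables.

```python
from collections import defaultdict
from typing import Callable, Iterator

# One row of a summary table: (general topic, subtopic, sentiment).
# This row shape is an assumption made for illustration.
TopicRow = tuple[str, str, str]


def batch_responses(responses: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive batches of at most batch_size responses."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(responses), batch_size):
        yield responses[start:start + batch_size]


def merge_topic_tables(tables: list[list[TopicRow]]) -> dict[TopicRow, int]:
    """Combine per-batch tables, counting how many times each row appears."""
    counts: dict[TopicRow, int] = defaultdict(int)
    for table in tables:
        for row in table:
            counts[row] += 1
    return dict(counts)


def extract_topics(
    responses: list[str],
    batch_size: int,
    summarise_batch: Callable[[list[str]], list[TopicRow]],
) -> dict[TopicRow, int]:
    """Run the (hypothetical) LLM call on each batch, then merge the tables."""
    tables = [summarise_batch(batch) for batch in batch_responses(responses, batch_size)]
    return merge_topic_tables(tables)
```

In use, `summarise_batch` would wrap whichever model is selected; any function that takes a list of responses and returns a list of topic rows fits this sketch.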

NOTE: Large language models are not 100% accurate and may produce biased or harmful outputs. All outputs from this app must be checked by a human for harmful content, hallucinations, and accuracy.

Choose a tabular data file (xlsx, csv, parquet) of open text to extract topics from.

LLM model
Select the open text column of interest. In an Excel file, this shows columns across all sheets.
Select the open text column to group by
Force responses into zero shot topics
Ask the model to assign responses to only a single topic
Ask the model to produce structured summaries using the zero shot topics as headers rather than extract topics
Choose sentiment categories to split responses

Test with an example dataset

Examples