Roleplaying with Structure: Synthetic Therapist-Client Conversation Generation from Questionnaires

Technical University of Darmstadt1, Philipps-University Marburg2,
Justus Liebig University Giessen3, University of Münster4,
ELLIS Institute Finland5, University of Turku6

*Work done while with TU Darmstadt

Abstract

Large Language Models (LLMs) are promising tools for synthetic data generation in mental health, yet previous work is mostly limited to generic seed information because of privacy regulations. We present SQPsych (Structured Questionnaire-based Psychotherapy), a pipeline for generating synthetic therapist-client conversations. SQPsych uses real structured client profiles and psychological questionnaires to generate synthetic corpora, SQPsychConv. We then fine-tune seven open-weight LLMs and evaluate them using automatic benchmarks and trained psychotherapists. Our fine-tuned models remain competitive with baselines on surface-level counseling benchmarks and almost always outperform the previous mental-health state of the art. Expert evaluation further shows that SQPsych significantly improves LLMs’ ability to roleplay therapists, and experts consistently prefer therapy sessions generated by our models over those from other mental-health-oriented LLMs. We release our code, fine-tuned SQPsychLLM models, and synthetic corpora at https://ai-mh.github.io/SQPsych.html.

SQPsychConv Datasets

We provide several variations of the SQPsychConv dataset, generated by different large language models. The finetuned versions represent a larger, more diverse corpus. All datasets are available on Hugging Face.

🤗 Dataset Conversations Generating Model
AIMH/SQPsychConv_qwq 2.09k Qwen/QwQ-32B
AIMH/SQPsychConv_nemotron 2.09k nvidia/Llama-3_3-Nemotron-Super-49B-v1
AIMH/SQPsychConv_llama3 2.09k meta-llama/Llama-3.3-70B-Instruct
AIMH/SQPsychConv_qwen-2.5 2.09k Qwen/Qwen2.5-72B-Instruct
AIMH/SQPsychConv_mistral 2.09k mistralai/Mistral-Large-Instruct-2407
AIMH/SQPsychConv_command 2.09k CohereLabs/c4ai-command-a-03-2025
AIMH/SQPsychConv_gemma 2.09k google/gemma-3-27b-it
AIMH/SQPsychConv_qwq_no_questionnaire 2.09k Qwen/QwQ-32B
AIMH/SQPsychConv_nemotron_no_questionnaire 2.09k nvidia/Llama-3_3-Nemotron-Super-49B-v1
AIMH/SQPsychConv_llama3_no_questionnaire 2.09k meta-llama/Llama-3.3-70B-Instruct
AIMH/SQPsychConv_qwen-2.5_no_questionnaire 2.09k Qwen/Qwen2.5-72B-Instruct
AIMH/SQPsychConv_mistral_no_questionnaire 2.09k mistralai/Mistral-Large-Instruct-2407
AIMH/SQPsychConv_command_no_questionnaire 2.09k CohereLabs/c4ai-command-a-03-2025
AIMH/SQPsychConv_gemma_no_questionnaire 2.09k google/gemma-3-27b-it
AIMH/SQPsychConv_command_finetune 29.2k CohereLabs/c4ai-command-a-03-2025 (Finetuning format)
AIMH/SQPsychConv_gemma_finetune 32.4k google/gemma-3-27b-it (Finetuning format)
AIMH/SQPsychConv_llama3_finetune 47.7k meta-llama/Llama-3.3-70B-Instruct (Finetuning format)
AIMH/SQPsychConv_qwen-2.5_finetune 29.1k Qwen/Qwen2.5-72B-Instruct (Finetuning format)
AIMH/SQPsychConv_qwq_finetune 34.9k Qwen/QwQ-32B (Finetuning format)
AIMH/SQPsychConv_mistral_finetune 46k mistralai/Mistral-Large-Instruct-2407 (Finetuning format)
AIMH/SQPsychConv_nemotron_finetune 29k nvidia/Llama-3_3-Nemotron-Super-49B-v1 (Finetuning format)
AIMH/SQPsychConv_command_no_questionnaire_finetune 33.7k CohereLabs/c4ai-command-a-03-2025 (Finetuning format)
AIMH/SQPsychConv_gemma_no_questionnaire_finetune 31.8k google/gemma-3-27b-it (Finetuning format)
AIMH/SQPsychConv_llama3_no_questionnaire_finetune 45.7k meta-llama/Llama-3.3-70B-Instruct (Finetuning format)
AIMH/SQPsychConv_qwen-2.5_no_questionnaire_finetune 37.8k Qwen/Qwen2.5-72B-Instruct (Finetuning format)
AIMH/SQPsychConv_qwq_no_questionnaire_finetune 34.9k Qwen/QwQ-32B (Finetuning format)
AIMH/SQPsychConv_mistral_no_questionnaire_finetune 37.3k mistralai/Mistral-Large-Instruct-2407 (Finetuning format)
AIMH/SQPsychConv_nemotron_no_questionnaire_finetune 27.8k nvidia/Llama-3_3-Nemotron-Super-49B-v1 (Finetuning format)

SQPsychLLM Models

We also release the 8B parameter SQPsychLLM models, finetuned on the synthetic conversations from the datasets above.

🤗 Model Size Training Data
AIMH/SQPsychLLM-8b-qwen-2.5 8B SQPsychConv (Qwen 2.5)
AIMH/SQPsychLLM-8b-mistral 8B SQPsychConv (Mistral)
AIMH/SQPsychLLM-8b-gemma 8B SQPsychConv (Gemma)
AIMH/SQPsychLLM-8b-qwq 8B SQPsychConv (Qwen/QwQ)
AIMH/SQPsychLLM-8b-command 8B SQPsychConv (Command R)
AIMH/SQPsychLLM-8b-llama3.3 8B SQPsychConv (Llama 3.3)
AIMH/SQPsychLLM-8b-nemotron 8B SQPsychConv (Nemotron)
AIMH/SQPsychLLM-8b-gemma-no_questionnaire 8B SQPsychConv (Gemma) No Questionnaires
AIMH/SQPsychLLM-8b-command-no_questionnaire 8B SQPsychConv (Command R) No Questionnaires
AIMH/SQPsychLLM-8b-gemma-Qwen 8B SQPsychConv (Gemma) (Finetune on Qwen2.5-7B-Instruct)
AIMH/SQPsychLLM-8b-gemma-Qwen_no_questionnaire 8B SQPsychConv (Gemma) No Questionnaires (Finetune on Qwen2.5-7B-Instruct)

Dataset Statistics

Dataset statistics comparing our approach to previous works on mental health counseling.

Dataset # Utt. # Avg. turns # Tok./utt.
CACTUS 995,512 15.263 27.051
Psych8k 16,374 1 54.685
SQPsychConv (command) 64,760 17.451 51.019
SQPsychConv (gemma) 71,000 16.999 51.790
SQPsychConv (nemotron) 64,238 15.911 51.432
SQPsychConv (mistral) 98,342 23.119 31.098
SQPsychConv (llama3.3) 101,694 24.599 32.627
SQPsychConv (qwen2.5) 64,488 15.534 34.489
SQPsychConv (qwq) 77,134 18.601 26.291

BibTeX

@article{vu2025roleplayingstructuresynthetictherapistclient,
      title={Roleplaying with Structure: Synthetic Therapist-Client Conversation Generation from Questionnaires}, 
      author={Doan Nam Long Vu and Rui Tan and Lena Moench and Svenja Jule Francke and Daniel Woiwod and Florian Thomas-Odenthal and Sanna Stroth and Tilo Kircher and Christiane Hermann and Udo Dannlowski and Hamidreza Jamalabadi and Shaoxiong Ji},
      year={2025},
      journal={arXiv preprint arXiv:2510.25384},
      url={https://arxiv.org/abs/2510.25384}, 
}