PhD Studentship: Editable and Traceable Language Models for Accountable Human-AI Interaction

Employer: University of East Anglia

Location: Norwich, England, United Kingdom

Salary: £20,408

Job type: Full Time, Fixed-Term/Contract

Posted: 2026-04-28T00:00:00Z

Sector: Education & Training

Job Description

Project supervisor: Dr Farhana Ferdousi Liza

Language processing in humans and deep language models share underlying computational principles. Mechanisms for updating deep large language models, such as editing memory in a transformer to replace harmful information or inject specialised knowledge, offer new promise for designing safe, secure, and accountable artificial intelligence. However, most current high-capacity language models are accessed only via pretrained models, API calls, or web interfaces (e.g., ChatGPT). While convenient, this approach limits researchers' ability to inspect or modify a model's internal behaviour and prevents the deployment of accountable models in sensitive domains. Consequently, critical questions about how language models acquire knowledge, store memory, exhibit bias, or fail (e.g., hallucination, misaligned content generation) remain scientifically unanswered.

This PhD project will address one of these critical questions. You will develop, train, and evaluate language models, including transformer-based and retrieval-augmented generative models, from the ground up using high-performance computing (e.g., NVIDIA RTX 6000 ADA 48GB 4DP Graphics) and specialised datasets (e.g., parent-child interaction language). You will then evaluate your models along one or more dimensions of responsible AI, such as safety (harmful outputs, unintended behaviours, and jailbreaks), security (robustness to adversarial inputs and data poisoning), and accountability (tracing outputs back to training data or internal representations). Finally, you will deploy your models on an embodied AI system or social robot (e.g., Furhat Robots) and conduct human-AI interaction experiments to identify where, why, and how these models succeed or fail in real-time, face-to-face conversations.

This study will deliver a grounded architecture for reliable and trustworthy language models suitable for deployment in sensitive domains such as education and healthcare.

The School of Computing Sciences ( https://www.uea.ac.uk/about/school-of-computing-sciences ) provides a vibrant research environment for conducting Computing and allied research and training. We collaborate with multi-national companies such as Apple, BT, the National Trust and Aviva, research institutes in the Norwich Research Park ( https://www.norwichresearchpark.com ), as well as other universities and industries in the UK and overseas. We are also members of the Turing University Network, a group of 65 UK universities working together to advance world-class research and build skills for the future.

The successful candidate will also be expected to contribute to Tutor activities for laboratory support on our BSc and MSc Courses in Artificial Intelligence, Data Science, and Computing Sciences, commensurate with their core expertise and within the working hours permitted for full-time Postgraduate Researchers.

Entry requirements: The standard minimum entry requirement is a 2:1 in Computer Science or a related subject area.

Mode of study: Full-time

Start date: 1 October 2026

Funding details: This PhD project is in a competition for a funded studentship. Funding comprises ‘Home’ tuition fees, an annual tax-free maintenance stipend (2026/27 rate £20,408) for a maximum of 3 years, and £2,000 per annum to support research training activities.

Apply on jobs.ac.uk
