Inkit Padhi
ML & NLP Researcher @ IBM Research, New York
Hi! My name is Inkit. I'm a researcher working in the fields of Machine Learning (ML) and Natural Language Processing (NLP). My current work centers on improving the safety, reliability, and trustworthiness of Large Language Models (LLMs).
My primary interests lie in safety and alignment, synthetic data generation, steering, and influence-based attributions within LLMs.
My past research spans various areas of deep learning, including learning representations for diverse modalities, counterfactual generation, interpretability, text style transfer, and unsupervised learning. I began my research journey at USC/ISI under the guidance of Kevin Knight; our work on probing in sequence models was a foundational contribution to the field of interpretability.
Email: $first_name.$last_name@gmail.com
Updates
Dec 2024 | Attending NeurIPS ‘24 to present our work at the Pluralistic Alignment workshop.
---|---
Dec 2024 | Our paper “Granite Guardian” is now available as a preprint on arXiv.
Nov 2024 | I’ll be presenting our work “Value Alignment From Unstructured Text” at EMNLP 2024.
Oct 2024 | Granite Guardian 3.0 is out! It detects risks in both inputs and responses, including various harms and RAG hallucinations.
Sep 2024 | CAST: Check out my exceptional summer intern Bruce Lee’s work on conditional activation steering.
Aug 2024 | Alignment Studio is accepted to IEEE Internet Computing! We introduce an architecture that facilitates aligning LMs to the specific values, norms, and regulations of a given context.