In Data Science Collective, by Rachel Draelos, MD, PhD. HealthBench Does Not Evaluate Patient Safety. HealthBench is a recently released benchmark to evaluate large language models in healthcare. This blog post summarizes what HealthBench… (5d ago)
Joshua Anang. The math and logic behind ChatGPT. This paper is all you need. I’m Joshua Anang and I was 17 years old (last year) when I built my own version of ChatGPT. I’m writing this paper to explain the math… (May 7)
In Data Science Collective, by Marcus K. Elwin. Ten Lessons from a Year Building AI Agents in LegalTech. An AI engineer’s journey optimizing legal workflows with lessons learned from building, deploying, and maintaining intelligent agents. (1d ago)
In The Quantastic Journal, by Rob Manson. Inside a Language Model’s Mind: Curved Inference as a New “AI Interpretability” Paradigm. New Evidence of the Shape of Thought. (May 11)
In Data Science Collective, by Florin Andrei. Train LLMs to Talk Like You on Social Media, Using Consumer Hardware. Use your own comments on social media to fine-tune an LLM, and run all fine-tuning on (relatively) inexpensive hardware. (May 10)
Jason Clark. How AI Models Fake Alignment and Why You Should Care. I remember watching “The Usual Suspects” for the first time; it’s one of those movies where you can only truly enjoy it the first time… (May 8)
May Reese. When Algorithms Judge: What We Learned from Examining LLM Decision-Making in the Legal Domain. This post describes the process and results of a 72-hour research sprint project. We explore LLM performance on legal decision-making. (May 7)
In Leading EDJE, by Matt Eland. Reference Architecture for AI Developer Productivity. (May 6)