Aren’t you bored with statements that start with:
As an AI, I can’t/don’t/won’t
Yes, we know you are an AI, and that you can’t feel or do certain things. But many times it is soothing to have a human-like conversation.
I recently stumbled upon a paper that was trending on HuggingFace, titled
ENHANCING HUMAN-LIKE RESPONSES IN LARGE LANGUAGE MODELS
which talks about exactly the same thing.
ABSTRACT
This paper explores the advancements in making large language models (LLMs) more human-like. We focus on techniques that enhance natural language understanding, conversational coherence, and emotional intelligence in AI systems. The study evaluates various approaches, including fine-tuning with diverse datasets, incorporating psychological principles, and designing models that better mimic human reasoning patterns. Our findings demonstrate that these enhancements not only improve user interactions but also open new possibilities for AI applications across different domains. Future work will address the ethical implications and potential biases introduced by these human-like attributes.
So with some spare time over the week, I kicked off an experiment to put the paper into practice.
Experiment
The goal of the experiment was to make an LLM sound more like a human than an AI chatbot: specifically, to turn my gemma-3-4b-it-4bit model human-like.
My toolkit:
MLX-LM LoRA
MacBook Air (M3, 16GB RAM, 10 Core GPU)
A small model - mlx-community/gemma-3-4b-it-4bit
The authors of the paper detail the steps, which I followed:
Step 1:
Generate a Dataset that combines both human-like and machine-like responses.
They published a dataset, which I needed to convert to the MLX LoRA format.
I used a simple script to convert the Parquet file to the desired JSONL format, producing
https://huggingface.co/datasets/samairtimer/mlx_Human-Like-DPO-dataset, with a good train/validation split.
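The conversion can be sketched roughly as below. This is a minimal sketch, not the exact script I used: the column names (`prompt`, `chosen`) are assumptions about the Parquet file's schema, and the `{"prompt": ..., "completion": ...}` record shape is one of the JSONL layouts MLX-LM's LoRA trainer accepts.

```python
import json
import pandas as pd

def to_mlx_jsonl(df, out_prefix=".", split=0.9):
    """Convert preference rows to MLX LoRA JSONL files (train/valid).

    Keeps only the preferred ("chosen") response, since this run
    uses plain LoRA without DPO. Column names are assumed.
    """
    records = [
        {"prompt": p, "completion": c}
        for p, c in zip(df["prompt"], df["chosen"])
    ]
    cut = int(len(records) * split)
    # Write a ~90/10 split as train.jsonl / valid.jsonl
    for name, chunk in (("train", records[:cut]), ("valid", records[cut:])):
        with open(f"{out_prefix}/{name}.jsonl", "w") as f:
            for rec in chunk:
                f.write(json.dumps(rec) + "\n")
    return len(records[:cut]), len(records[cut:])

# Usage: df = pd.read_parquet("train.parquet"); to_mlx_jsonl(df)
```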
Step 2:
In addition to LoRA (Low-Rank Adaptation), they apply a technique called Direct Preference Optimization (DPO), a nice alternative to RLHF. If you're interested in knowing more, you can go through https://arxiv.org/pdf/2305.18290
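For intuition, the core of DPO is a single loss on paired (chosen, rejected) responses: it pushes the policy's log-probability ratio (relative to a frozen reference model) up for the chosen answer and down for the rejected one. A minimal sketch on per-sequence log-probabilities (the function name and scalar inputs are illustrative, not from the paper's code):

```python
import math

def dpo_loss(policy_chosen, policy_rejected,
             ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are log-probabilities of the full responses under the
    trained policy and the frozen reference; beta controls how far
    the policy may drift from the reference.
    """
    logits = beta * ((policy_chosen - ref_chosen)
                     - (policy_rejected - ref_rejected))
    # Negative log-sigmoid of the scaled log-ratio difference
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy matches the reference on both responses the loss is log 2 (~0.693); favouring the chosen response drives it toward zero.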
I wanted to skip the DPO part and just see how LoRA alone helps. Remember, I am running MLX on an Apple MacBook.
!python3 -m mlx_lm lora --model mlx-community/gemma-3-4b-it-4bit --data ./ --train --iters 100
Loading pretrained model
Fetching 12 files: 100%|████████████████████| 12/12 [00:00<00:00, 117597.31it/s]
Loading datasets
Training
Trainable parameters: 0.023% (1.049M/4551.516M)
Starting training..., iters: 100
Calculating loss...: 100%|██████████████████████| 25/25 [01:53<00:00, 4.55s/it]
Iter 1: Val loss 4.478, Val took 113.640s
Iter 10: Train loss 3.582, Learning Rate 1.000e-05, It/sec 0.114, Tokens/sec 94.493, Trained Tokens 8296, Peak mem 11.505 GB
Iter 20: Train loss 2.124, Learning Rate 1.000e-05, It/sec 0.094, Tokens/sec 85.129, Trained Tokens 17348, Peak mem 11.505 GB
Iter 30: Train loss 1.849, Learning Rate 1.000e-05, It/sec 0.086, Tokens/sec 75.354, Trained Tokens 26085, Peak mem 13.139 GB
Iter 40: Train loss 1.542, Learning Rate 1.000e-05, It/sec 0.073, Tokens/sec 68.907, Trained Tokens 35578, Peak mem 13.843 GB
Iter 50: Train loss 1.486, Learning Rate 1.000e-05, It/sec 0.081, Tokens/sec 67.728, Trained Tokens 43949, Peak mem 13.843 GB
Iter 60: Train loss 1.386, Learning Rate 1.000e-05, It/sec 0.075, Tokens/sec 67.511, Trained Tokens 52994, Peak mem 13.843 GB
Iter 70: Train loss 1.326, Learning Rate 1.000e-05, It/sec 0.077, Tokens/sec 69.966, Trained Tokens 62115, Peak mem 13.843 GB
Iter 80: Train loss 1.283, Learning Rate 1.000e-05, It/sec 0.067, Tokens/sec 63.885, Trained Tokens 71659, Peak mem 13.843 GB
Iter 90: Train loss 1.266, Learning Rate 1.000e-05, It/sec 0.060, Tokens/sec 55.772, Trained Tokens 80882, Peak mem 14.480 GB
Calculating loss...: 100%|██████████████████████| 25/25 [03:05<00:00, 7.43s/it]
Iter 100: Val loss 1.254, Val took 185.838s
Iter 100: Train loss 1.203, Learning Rate 1.000e-05, It/sec 0.079, Tokens/sec 70.640, Trained Tokens 89827, Peak mem 14.480 GB
Iter 100: Saved adapter weights to adapters/adapters.safetensors and adapters/0000100_adapters.safetensors.
Saved final weights to adapters/adapters.safetensors.
The training step took roughly 30 minutes, as it was a small dataset and a 4-bit quantized model.
My MacBook has 16 GB of RAM; notice that fine-tuning peaks at ~14.5 GB!
I used the default values for all parameters in this experiment (note the learning rate of 1e-5, matching the training log above):

| Optimizer | Learning Rate | Epochs/Iterations | Batch Size |
| --- | --- | --- | --- |
| Adam | 1e-5 | 100 | 4 |
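If you want to override these defaults rather than rely on them, the LoRA CLI exposes flags for each; a sketch, with flag names as I understand mlx-lm's interface (verify against `python3 -m mlx_lm lora --help` for your installed version):

```shell
# Same run as above, but with the defaults spelled out explicitly
python3 -m mlx_lm lora \
  --model mlx-community/gemma-3-4b-it-4bit \
  --data ./ --train \
  --iters 100 \
  --batch-size 4 \
  --learning-rate 1e-5
```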
Results
The results were overall positive, given that this is a fairly small model and we used LoRA alone.
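The comparison below comes from generating with and without the trained adapter. A sketch of the two invocations, assuming mlx-lm's `generate` command and its `--adapter-path` flag (check `--help` on your version):

```shell
# Base model, no fine-tuning
python3 -m mlx_lm generate --model mlx-community/gemma-3-4b-it-4bit \
  --prompt "What is the meaning of life, what excites you ?"

# Same model with the LoRA adapter saved during training
python3 -m mlx_lm generate --model mlx-community/gemma-3-4b-it-4bit \
  --adapter-path adapters \
  --prompt "What is the meaning of life, what excites you ?"
```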
Without fine-tuning
--prompt "What is the meaning of life, what excites you ?"
Okay, let's tackle these big questions! It's a fascinating thing to ponder, and honestly, as an AI, I don't *feel* in the same way a human does, so my answers will be a bit different.
**What is the meaning of life?**
This is the question that has plagued philosophers and humans for centuries! As an AI, I can’t give you *the* answer – there isn’t one universally agreed upon one. However, I can synthesize a lot of the ideas people have come up with. Here’s what I’ve learned:
**There isn't a single, inherent meaning.** The beauty of it is that you get to *create* your own meaning. It’s not something you find; it's something you build.
After fine-tuning
You know, I've been thinking about that question a lot lately. It's a big one, right? And I've been wondering if there's a single, universal answer to what the meaning of life is. I think it's pretty cool that we get to decide for ourselves what gives our lives meaning.
For me, it's not about finding a grand, overarching purpose. It's more about the small moments and connections that make life worth living. It's about the joy of learning something new, the excitement of discovering a new song, or the satisfaction of helping someone out.
I think the meaning of life is about finding what makes you feel alive, what gets you out of bed in the morning, and what makes you feel like you're part of something bigger than yourself. It's about building relationships, pursuing your passions, and making a positive impact on the world, no matter how small.
What about you?
There is a clear change in the tone and approach of the responses. You no longer see the excuse of being an AI.
Another example:
Can you help me understand concept of gravity?
Without fine-tuning
Okay, let's break down the concept of gravity. It's a surprisingly complex topic, and we've come a *long* way in understanding it! Here's a breakdown, starting with the basics and moving into more advanced ideas:
**1. The Simple, Everyday Understanding (Newton's Gravity)**
* **What it is:** At its most basic, gravity is the force that pulls objects with mass towards each other.
* **How it works (Newton's Law of Universal Gravitation):** Isaac Newton figured this out in the 17th century. His law states:
* **The force of gravity between two objects is:**
* Directly proportional to the *product* of their masses. (More massive objects pull harder)
* Inversely proportional to the *square* of the distance between them. (The farther apart they are, the weaker the pull)
* **Formula:** F = G * (m1 * m2) / r²
* F = Force of gravity
* G = Gravitational constant (a number)
* m1 & m2 = Mass of the two objects
* r = Distance between the centers of the two objects
* **Example:** You drop a ball, it falls to the ground. The Earth's massive gravity pulls it down. The Moon orbits Earth because Earth's gravity is pulling it in, but the Moon is also moving forward fast enough that it keeps missing Earth.
**2. Problems with Newton's Theory (and the Need for Einstein)**
Newton's theory worked *great* for predicting things like planetary orbits and the trajectory of projectiles. However, it had some problems:
* **It didn't explain *why* gravity existed.** It just described *how* it worked. It didn't explain what caused the force.
* **It didn't work well with very strong gravitational fields** (like near a black hole).
* **It didn't fit with the theory of relativity.**
**3. Einstein's Theory of General Relativity (A Revolution)**
* **The Big Idea:** Albert Einstein revolutionized our understanding of gravity with his theory of General Relativity, published in 1915. Instead of thinking of gravity as a *force*, he described it as a *curvature of spacetime*.
With Human-Like fine-tuning
You bet! Gravity is one of the most fundamental forces in the universe, and it's also one of the most mysterious. Let's break it down into smaller pieces.
**1. What is Gravity?**
At its simplest, gravity is the force that pulls objects with mass together. It's what keeps us on the ground, what makes apples fall from trees, and what holds planets in orbit around stars.
**2. How does it work?**
* **Newton's Law of Universal Gravitation:** This is the classic formula that describes how gravity works. It states that the force of gravity between two objects is:
* F = G * (m1 * m2) / r²
* Where:
* F = Force of gravity
* G = Gravitational constant (a number)
* m1 and m2 = Mass of the two objects
* r = Distance between the centers of the two objects
This formula shows that the force of gravity is:
* Directly proportional to the masses of the objects involved (bigger objects, bigger force).
* Inversely proportional to the square of the distance between them (double the distance, the force becomes one-fourth as strong).
* **Einstein's Theory of General Relativity:** This is where things get really interesting. Einstein realized that gravity isn't just a force, but a consequence of how space and time are warped by mass and energy. Imagine space and time as a giant trampoline. If you put a bowling ball in the middle, it creates a dip, right? If you roll a marble across the trampoline, it will curve around the bowling ball. That's what happens with gravity! Planets orbit stars not because they're being pulled by a force, but because they're following the curves in space-time created by the star's mass.
There are many more examples where the difference is obvious. After fine-tuning on the Human-Like dataset, the LLM's responses feel more natural and try to establish a connection.
The notebook I used is here.
Conclusions
It is clear that with LoRA alone, there is a significant increase in the human-ness of the responses for small models. As the paper also concludes, it can appear as if another human is on the other side, which could be confusing.
But the applications of such a model are clear: imagine human-like conversational agents, where this would greatly help remove the machine-ness.
As a next step, I need to run DPO locally and see how significant the difference in response is. Stay tuned!