How We Tried to Make AI Genuinely Useful for Doctors (Without Being Annoying)

We didn’t set out to revolutionize medicine or build a magic black box. Mostly, we wanted to make things less painful for the people keeping hospitals running. That meant doctors, nurses, front desk staff—everyone trying to make decisions while being pulled in five directions.

So we asked: Can AI help, quietly?

What Actually Happens Inside a Hospital

If you've never worked in or around one, you might assume hospitals are organized. They're not. Not in the way software folks think.

There are alarms. Whiteboards covered in scribbles. Four different systems that don’t talk to each other. And too many sticky notes taped to monitors.

Everyone’s improvising. Constantly. So any “smart system” has to respect that mess.

The First Thing We Tried (and Honestly, It Wasn’t Great)

We thought we’d start simple. Patients come in with symptoms, right? So we built a small model that could suggest a few likely diagnoses based on their initial complaints.

Was it accurate? Sometimes. But the point wasn’t to be right—it was to reduce that blank stare when there’s too little context and too much noise.

Here’s a rough version from early testing:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny toy dataset from early testing: free-text complaints and their labels
symptoms = [
    "chest pain and breathlessness",
    "sore throat and high fever",
    "abdominal pain with nausea",
]
diagnoses = ["cardiac issue", "flu", "gastritis"]

# Turn the complaints into TF-IDF features and fit a simple classifier
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(symptoms)

clf = LogisticRegression()
clf.fit(X, diagnoses)

# Suggest a likely diagnosis for a new complaint
input_symptom = ["fever and sore throat"]
print(clf.predict(vectorizer.transform(input_symptom)))
```

It wasn’t magic. But a few nurses told us, “Hey, that saved me 30 seconds.” That’s a win.

Then We Got Curious: Could It Recommend Tests?

Doctors tend to order certain tests together. CBC and CRP. X-ray and ECG. So we wondered—if a doctor orders one, can we gently surface the others that are usually linked?

We didn’t want to prompt. Just… nudge.

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Each row is one visit: True means that test was ordered
df = pd.DataFrame([
    {'CBC': 1, 'X-Ray': 1, 'MRI': 0},
    {'CBC': 1, 'X-Ray': 0, 'MRI': 1},
    {'CBC': 1, 'X-Ray': 1, 'MRI': 1},
]).astype(bool)

# Surface test combinations that co-occur often enough to be worth a nudge
frequent = apriori(df, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[['antecedents', 'consequents']])
```

It worked quietly in the background. And again—not everyone used it. But those who did? They never wanted to go back.

The One That Made Everyone Nervous: Predicting Risk from Vitals

This part was touchy.

Vitals come in every few minutes. Most of them look normal—until they don’t. We figured: what if we ran a model on that stream and raised a flag when something looked off?

Not a red alert. Just a heads-up.

```python
import xgboost as xgb
import numpy as np

# One reading from the vitals stream
X = np.array([[103, 88, 23, 38.2]])  # HR, BP, Resp, Temp

# Load a pre-trained risk model from disk
model = xgb.XGBClassifier()
model.load_model("risk_model.json")

# Probability of the "deterioration" class
prob = model.predict_proba(X)[0][1]
if prob > 0.72:
    print("Potential deterioration. Suggest nurse check-in.")
```

It freaked some people out at first. We toned it down. Added context. Allowed them to ignore it. Eventually, it found its place.

What Really Helped? Reordering the Dashboard

Forget AI predictions. What helped most was simply sorting the patient list better.

We created a scoring function. Risk + test delays + unresolved meds + how long since the last note. Whoever ranked highest floated to the top.

No machine learning. No buzzwords. Just relevance for the OPD software.
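
Here's roughly what that looks like. The field names and weights below are placeholders for illustration, not the exact ones we used:

```python
# Illustrative only: field names and weights are placeholders, not the production logic.
def priority_score(patient):
    score = 0
    score += patient.get("risk_flag", 0) * 40                   # flagged as at-risk
    score += min(patient.get("pending_test_hours", 0), 24)      # tests still waiting on results
    score += patient.get("unresolved_meds", 0) * 5              # medication orders not reconciled
    score += min(patient.get("hours_since_last_note", 0), 12)   # how stale the chart is
    return score

patients = [
    {"name": "Bed 4", "risk_flag": 1, "pending_test_hours": 3, "unresolved_meds": 0, "hours_since_last_note": 6},
    {"name": "Bed 7", "risk_flag": 0, "pending_test_hours": 18, "unresolved_meds": 2, "hours_since_last_note": 1},
]
patients.sort(key=priority_score, reverse=True)  # highest-need patient floats to the top
```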

Doctors started noticing without being told. That’s the best kind of feature.

The Unexpected Learning Loop

We added logging just to be safe—turns out, it became a feedback goldmine.

Every time someone ignored a suggestion or changed course, we kept track. Over time, it helped us understand where the models were being too eager, or where they were actually helpful.
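
A minimal sketch of that kind of logging, with made-up field names, just to show the shape of the record kept per suggestion:

```python
import json
import time

# Field names are placeholders; the idea is to record what was suggested vs. what actually happened.
def log_suggestion_outcome(suggestion_id, suggestion, action_taken, accepted,
                           log_path="suggestion_log.jsonl"):
    record = {
        "ts": time.time(),
        "suggestion_id": suggestion_id,
        "suggestion": suggestion,      # what the model proposed
        "action_taken": action_taken,  # what the clinician actually did
        "accepted": accepted,          # False means it was ignored or overridden
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")

log_suggestion_outcome("sug-0042", "order CRP alongside CBC", "ordered CBC only", accepted=False)
```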

More importantly, it reminded us that AI doesn’t need to be right—it just needs to get out of the way when it’s wrong.

Privacy Isn’t a Feature, It’s a Given

No model touches patient data without being audited. Full stop.

  • Every prediction is logged
  • Every override is tracked
  • Every field is role-gated

If a patient’s safety is involved, there are no shortcuts. We learned that quickly. And the hard way, sometimes.
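
To make the "role-gated" part concrete, here's a simplified sketch. The roles, fields, and data layout are invented for the example:

```python
# Invented roles and fields, purely to illustrate role-gated reads with an audit trail.
FIELD_ROLES = {
    "vitals": {"doctor", "nurse"},
    "risk_predictions": {"doctor"},
    "billing_notes": {"front_desk"},
}

audit_log = []

def read_field(user, role, patient_record, field):
    allowed = role in FIELD_ROLES.get(field, set())
    # Every access attempt is recorded, allowed or not
    audit_log.append({"user": user, "role": role, "field": field, "allowed": allowed})
    if not allowed:
        raise PermissionError(f"Role '{role}' may not read '{field}'")
    return patient_record[field]

record = {"vitals": {"hr": 103, "temp": 38.2}, "risk_predictions": 0.74, "billing_notes": "..."}
print(read_field("nurse_ann", "nurse", record, "vitals"))  # allowed, and logged either way
```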

So, Was It AI? Sure. But It Was Mostly Common Sense.

At the end of the day, the things that stuck weren’t the smartest or most complex.

It was:

  • Sorting patients by priority
  • Nudging for tests
  • Reducing clicks
  • Being quiet

Sometimes the best AI feels like... nothing at all. Just things working slightly better than before, like what we did for our hospital management software, NZCares.

If you’re building for hospitals or anywhere real people rely on fast decisions—avoid the shiny. Stick to the useful. And if it gets ignored? That’s feedback, not failure.

We’re still figuring it out. Would love to hear how others are, too.
