Andon
Andon is the visual + audible signal that pulls support to a production line the moment an operator hits a problem: a missing part, a quality call, a safety hazard, an equipment alarm, a changeover overrun. Originating at Toyota as the cord that any operator could pull to stop the line, it is the human half of jidoka (autonomation, or automation with a human touch) and the operational expression of stop-the-line authority. In a modern MES it is a real-time digital channel: the operator presses a tile, the right responder is paged in seconds, the resolution clock runs, and every event lands as a row in the OEE / downtime / quality ledger.
01 What andon actually is
Andon (行灯) is the Japanese word for a paper lantern. At Toyota in the 1950s, Taiichi Ohno and Eiji Toyoda hung overhead lanterns above each line section that lit up red when an operator pulled a cord; the line stopped, the team leader ran to the station, the problem was solved at root cause, and only then did the line restart. The principle: a defect produced is a defect that will be shipped, and the cheapest defect is the one you stop the line for.
Six decades later andon has evolved from a literal cord and lamp into a real-time digital channel — a kiosk tile that pages the right responder by role, computes the response and resolution clocks, captures the root cause, and feeds the OEE ledger and (when warranted) the deviation / CAPA pipeline. The principle is identical: surface problems instantly, attack them at root cause, never let them go silently.
02 Andon inside jidoka
Andon is one of the two operational pillars of jidoka — the Toyota Production System concept usually translated as "autonomation" or "automation with a human touch". Jidoka has two halves: (1) machines that detect their own abnormalities and stop themselves (poka-yoke + auto-stop); and (2) humans who have the authority — and the obligation — to stop the line the moment they see an abnormality. Andon is the channel through which (2) happens.
The cultural prerequisite is often underestimated: andon only works if a stop is celebrated, not punished. When a Western plant retrofits andon onto a culture that punishes downtime, operators simply do not pull the cord. The result is shipped defects + a dashboard that says everything is fine.
03 Andon categories
Best-practice digital andon presents the operator with a small, fixed set of reason codes — typically five — each colour-coded and each routing to a different responder. The codes are deliberately few so the operator's cognitive load is near-zero and the response time stays in seconds.
| Code | Colour (ANSI Z535) | Trigger | Default responder | Promote-to |
|---|---|---|---|---|
| MATERIAL | Blue | Missing component, wrong material, stock-out at line-side | Material handler / WMS coordinator | WMS replenishment task |
| QUALITY | Yellow | Defect, suspect material, out-of-spec reading | QA technician | Deviation if material is rejected |
| EQUIPMENT | Red | Equipment fault, alarm, jam, calibration overdue | Maintenance technician | Maintenance work order |
| SAFETY | Red + audible | Injury, near-miss, hazard, environmental release | EHS officer + first-aider | EHS incident report |
| CHANGEOVER / SUPPORT | Green | Changeover help, line-clearance approval, training question | Team leader / supervisor | Coaching note / training record |
The colour scheme is not arbitrary: it is modelled on ANSI Z535 / ISO 3864 safety-signalling conventions so the meaning is unambiguous across language barriers, and the audible layer carries the signal where the lamp is out of sight. Red = immediate hazard or stop; yellow = caution / quality; blue = action required (material); green = no hazard, informational / changeover; flashing + audible = safety override.
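The category table can be held as a small, fixed lookup. The following is a minimal sketch, not a vendor schema: the values are transcribed from the table above, while the class names, the role keys such as `qa_technician`, and the helper `signal_for` are hypothetical labels chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Category(Enum):
    MATERIAL = "MATERIAL"
    QUALITY = "QUALITY"
    EQUIPMENT = "EQUIPMENT"
    SAFETY = "SAFETY"
    SUPPORT = "CHANGEOVER / SUPPORT"


@dataclass(frozen=True)
class CategoryConfig:
    colour: str                       # lamp / tile colour from the table
    audible: bool                     # True where the table specifies an audible layer
    responder_roles: Tuple[str, ...]  # hypothetical role keys used for paging
    promote_to: str                   # downstream record when the event is promoted


# One entry per row of the category table.
CATEGORIES = {
    Category.MATERIAL: CategoryConfig("blue", False, ("material_handler",), "WMS replenishment task"),
    Category.QUALITY: CategoryConfig("yellow", False, ("qa_technician",), "Deviation"),
    Category.EQUIPMENT: CategoryConfig("red", False, ("maintenance_technician",), "Maintenance work order"),
    Category.SAFETY: CategoryConfig("red", True, ("ehs_officer", "first_aider"), "EHS incident report"),
    Category.SUPPORT: CategoryConfig("green", False, ("team_leader",), "Coaching note / training record"),
}


def signal_for(category: Category) -> str:
    """Describe the signal raised for a category, e.g. 'red + audible'."""
    cfg = CATEGORIES[category]
    return cfg.colour + (" + audible" if cfg.audible else "")
```

Keeping the mapping frozen and enumerated mirrors the point of the section: the operator picks from five fixed choices, and everything else (colour, responder, promotion target) follows from that one choice.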
04 The andon event lifecycle
- Trigger — operator taps the andon tile on the kiosk, selects the category. Optional 1-line note. The line / cell is marked andon-active.
- Acknowledgement — the system pages the responder by role (push notification on their device, page on the supervisor dashboard, overhead light on the andon board). The responder taps Acknowledge — the acknowledgement clock stops.
- Resolution — the responder is on the line. They diagnose, act, resolve. They tap Resolve, select a root-cause code from a short list, and add a 1-line note. The resolution clock stops.
- Categorisation — the system writes the event to the downtime / quality ledger with: line, cell, WO, operator, responder, category, root-cause, acknowledgement time, resolution time, total impact.
- Promote-or-close — if the event meets the threshold (e.g. >15 min downtime, quality reject, safety event) it is auto-promoted to the deviation / maintenance / EHS pipeline with all context pre-filled. Otherwise it closes without promotion; the ledger entry remains.
- Pattern detection — overnight, the system clusters andon events by line × cell × category × root-cause and surfaces clusters that exceed a threshold (e.g. >5 events of the same root-cause in 7 days) into the CAPA backlog as a candidate problem statement.
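The first three lifecycle steps (trigger, acknowledgement, resolution) amount to a small state machine with two clocks. The sketch below is one way to model it, assuming both clocks start at the trigger; the class `AndonEvent` and its field names are hypothetical, and in a real system the timestamps would be issued by the server rather than passed in.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class AndonEvent:
    line: str
    cell: str
    category: str
    triggered_at: datetime
    acknowledged_at: Optional[datetime] = None
    resolved_at: Optional[datetime] = None
    root_cause: Optional[str] = None

    @property
    def state(self) -> str:
        """Derive the state from which timestamps have been set."""
        if self.resolved_at is not None:
            return "resolved"
        if self.acknowledged_at is not None:
            return "acknowledged"
        return "active"

    def acknowledge(self, at: datetime) -> timedelta:
        """Stop the acknowledgement clock and return its reading."""
        if self.state != "active":
            raise ValueError("only an active event can be acknowledged")
        self.acknowledged_at = at
        return at - self.triggered_at

    def resolve(self, at: datetime, root_cause: str) -> timedelta:
        """Stop the resolution clock and return its reading (measured from the trigger)."""
        if self.state != "acknowledged":
            raise ValueError("only an acknowledged event can be resolved")
        self.resolved_at = at
        self.root_cause = root_cause
        return at - self.triggered_at
```

Deriving the state from the timestamps, rather than storing it separately, means the state can never disagree with the clocks that end up in the ledger.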
05 The KPIs andon feeds
Andon is the data source for several ISO 22400 KPIs and several operational metrics that do not have a formal standard yet but are universally used.
| KPI | Definition | Source |
|---|---|---|
| MTTA — Mean Time To Acknowledge | Average time from andon trigger to responder acknowledgement | Acknowledgement clock |
| MTTR — Mean Time To Resolve | Average time from trigger to resolution; some plants start the clock at acknowledgement instead, so state the convention and keep it fixed | Resolution clock; ISO 22400-2 §6, extended to all andon events, not only equipment |
| Andon rate | Number of andon events per shift or per million produced units | Event count ÷ output |
| Downtime attributed | Sum of resolution clocks across equipment + material categories | ISO 22400-2 — feeds Availability and OEE |
| Andon-to-deviation ratio | Fraction of QUALITY andons that promote to a regulated deviation | Promote-or-close decision |
| First-time-resolution rate | Fraction of andons resolved by the first-responder without escalation | Escalation flag |
MTTA + MTTR are the two metrics that most directly tell you whether your andon system is healthy. A plant where MTTA drifts above 3 minutes has either too few responders, too many false-positive triggers, or — most likely — responders who are not respecting the channel because the culture has slipped. Watch the trend, not the absolute number.
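Several of the KPIs in the table reduce to simple aggregates over closed events. The sketch below assumes each event is a plain dict with hypothetical keys `category`, `ack_s` (seconds from trigger to acknowledgement), `resolve_s` (seconds from trigger to resolution) and `escalated`; the function name `andon_kpis` is likewise an illustration, not a standard API.

```python
from statistics import mean
from typing import Dict, List

# Categories whose resolution clocks count as attributed downtime (per the KPI table).
DOWNTIME_CATEGORIES = {"EQUIPMENT", "MATERIAL"}


def andon_kpis(events: List[dict], units_produced: int) -> Dict[str, float]:
    """Compute MTTA, MTTR, attributed downtime, first-time resolution and andon rate."""
    if not events or units_produced <= 0:
        raise ValueError("need at least one event and a positive output count")
    return {
        "mtta_s": mean(e["ack_s"] for e in events),
        "mttr_s": mean(e["resolve_s"] for e in events),
        "downtime_s": sum(e["resolve_s"] for e in events
                          if e["category"] in DOWNTIME_CATEGORIES),
        "first_time_resolution": sum(1 for e in events if not e["escalated"]) / len(events),
        "andons_per_million_units": len(events) * 1_000_000 / units_produced,
    }


shift_events = [
    {"category": "EQUIPMENT", "ack_s": 60, "resolve_s": 600, "escalated": False},
    {"category": "MATERIAL", "ack_s": 120, "resolve_s": 300, "escalated": False},
    {"category": "QUALITY", "ack_s": 90, "resolve_s": 900, "escalated": True},
    {"category": "SUPPORT", "ack_s": 30, "resolve_s": 200, "escalated": False},
]
kpis = andon_kpis(shift_events, units_produced=20_000)
# kpis["mtta_s"] is 75 s and kpis["mttr_s"] is 500 s for this sample shift.
```

Because every figure comes from the same event list, the MTTA dashboard and the downtime ledger cannot drift apart, which is the practical benefit of a single event stream.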
06 Andon in a regulated plant
In a regulated plant — pharma, supplements, devices, food, cosmetics, radiopharm — andon plays an additional role: it is the system of record that proves an issue was raised, was acknowledged in a reasonable time, was resolved, and (when severity warranted) was promoted to a formal deviation / NCR / maintenance WO with an audit trail.
- Audit trail — every andon event is an EU GMP Annex 11 §9 audit-trail entry: who triggered, when, what category, who acknowledged, who resolved, what root cause, what promotion decision.
- 21 CFR 211.192 alignment — production-record review must include all deviations and unexplained discrepancies. QUALITY andons that did not promote to a deviation should leave a documented why-not so the reviewer can confirm the judgement.
- ALCOA+ — andon events are Attributable (operator + responder), Legible (kiosk capture, no handwriting), Contemporaneous (timestamps are server-issued), Original (kiosk is system of record), Accurate (root-cause from validated picklist). The +Complete / Consistent / Enduring / Available dimensions are platform-provided.
- GxP linkage — andon → deviation, andon → maintenance WO, andon → EHS incident, andon → CAPA — each promotion carries the andon event ID so the downstream record has a back-reference to the original moment-of-truth signal.
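The back-reference described in the last bullet can be illustrated with a small promotion function. This is a sketch under stated assumptions: the thresholds are the examples given in the lifecycle section (more than 15 minutes of downtime, a quality reject, any safety event), and the record-type strings, dict keys and the function name `promote` are hypothetical.

```python
from typing import Optional

# Downstream record type per category, following the category table.
PROMOTE_TARGET = {
    "MATERIAL": "wms_replenishment_task",
    "QUALITY": "deviation",
    "EQUIPMENT": "maintenance_work_order",
    "SAFETY": "ehs_incident",
    "SUPPORT": "coaching_note",
}
DOWNTIME_THRESHOLD_MIN = 15  # example threshold; configurable in practice


def promote(event: dict) -> Optional[dict]:
    """Return a pre-filled downstream record, or None when the event just closes."""
    meets_threshold = (
        event["category"] == "SAFETY"
        or event.get("quality_reject", False)
        or event["downtime_min"] > DOWNTIME_THRESHOLD_MIN
    )
    if not meets_threshold:
        return None
    return {
        "type": PROMOTE_TARGET[event["category"]],
        "andon_event_id": event["id"],  # back-reference to the original signal
        "prefill": {k: event[k] for k in ("line", "cell", "category", "root_cause")},
    }
```

Whatever the downstream record looks like, the one field that matters for traceability is `andon_event_id`: it lets a reviewer walk from the deviation, work order or incident back to the moment the operator raised the signal.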
07 Process vs discrete: terminology and triggers
| Dimension | Process manufacturing (pharma / food / chemicals) | Discrete manufacturing (devices / consumer / auto) |
|---|---|---|
| Trigger source | Operator + process-historian alarm forwards (high temperature, low flow, off-spec IPQ) | Operator + PLC line-side fault forwards (jam, low-air, sensor fault) |
| Most-frequent category | QUALITY (IPQ out-of-range) + EQUIPMENT (cleaning / calibration / CIP) | MATERIAL (stock-out) + EQUIPMENT (tool change, jam) + CHANGEOVER |
| Typical responder org | QA tech + Maintenance + Process Engineer | Material handler + Maintenance + Line Leader |
| Promote-to record | Deviation under 211.192 / 211.100 + Change Control if recurring | NCR (820.90 / ISO 9001 §10.2) + CAPA |
| KPI emphasis | Yield loss + RFT + deviation count | OEE Availability + MTTR + scrap rate |
08 Common mistakes
Mistake 1 — too many categories
Twelve categories sounds thorough; in practice operators stop using the system because picking the right one takes too long. Stay at five. The structured root-cause picklist on resolution carries the granularity.
Mistake 2 — punishing pulls
If an operator gets the question "why did you stop the line?" instead of "thank you for catching that — what do you need?", the andon system is dead within a week. Reward the pull, debug the root cause.
Mistake 3 — no clock on acknowledgement
If the acknowledgement clock is not visible and not tracked, responders drift to 5–15 minute response times within months. The MTTA dashboard, visible to everyone, is the discipline.
Mistake 4 — andon as group chat
Routing andon events into a generic Teams / Slack channel that everyone ignores defeats the purpose. The responder must be identified by role on each line / cell, must be on-shift, must carry the page, and must acknowledge inside the system — not via reply-in-thread.
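Role-based routing, as opposed to broadcasting into a channel, can be sketched as a lookup keyed on line, category and shift that only returns someone who is actually on shift. The roster layout, the user identifiers and the function name `pick_responder` below are hypothetical.

```python
from typing import Dict, List, Set, Tuple

RosterKey = Tuple[str, str, str]  # (line, category, shift)


def pick_responder(line: str, category: str, shift: str,
                   roster: Dict[RosterKey, List[str]],
                   on_shift: Set[str]) -> str:
    """Return the first configured responder who is currently on shift."""
    for person in roster.get((line, category, shift), []):
        if person in on_shift:
            return person
    # Failing loudly is deliberate: an unrouted andon is a configuration gap,
    # not something to fall back to a group channel for.
    raise LookupError(f"no on-shift responder for {line} / {category} / {shift}")


roster = {
    ("LINE-1", "EQUIPMENT", "night"): ["maint.primary", "maint.backup"],
    ("LINE-1", "QUALITY", "night"): ["qa.primary"],
}
responder = pick_responder("LINE-1", "EQUIPMENT", "night", roster,
                           on_shift={"maint.backup", "qa.primary"})
# responder is "maint.backup" because the primary is not on shift.
```

The ordered list per key gives a natural escalation path, while the `LookupError` makes a missing roster entry visible at configuration time instead of during an incident.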
Mistake 5 — no auto-promote to deviation
QUALITY andons that resolve quickly often still indicate a regulated deviation occurred. If the system does not auto-promote (with a clear de-promote decision logged when QA disagrees), the regulated record is missing and the andon was a private workaround.
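One way to picture auto-promotion with a logged de-promote decision is a draft record that QA must either confirm or explicitly reject with a reason. The sketch below assumes, for simplicity, that every QUALITY andon produces a draft (a configured threshold is omitted); the class `DraftDeviation` and the function `auto_promote` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DraftDeviation:
    andon_event_id: str
    status: str = "draft"
    decision_log: List[str] = field(default_factory=list)

    def _require_draft(self) -> None:
        if self.status != "draft":
            raise ValueError(f"decision already taken: {self.status}")

    def confirm(self, qa_user: str) -> None:
        """QA agrees: the draft becomes an open deviation."""
        self._require_draft()
        self.status = "open"
        self.decision_log.append(f"confirmed by {qa_user}")

    def de_promote(self, qa_user: str, reason: str) -> None:
        """QA disagrees: allowed only with a documented reason."""
        self._require_draft()
        if not reason.strip():
            raise ValueError("a de-promote decision needs a documented reason")
        self.status = "de-promoted"
        self.decision_log.append(f"de-promoted by {qa_user}: {reason.strip()}")


def auto_promote(andon_event_id: str, category: str) -> Optional[DraftDeviation]:
    """Create a draft deviation for QUALITY andons; other categories return None."""
    return DraftDeviation(andon_event_id) if category == "QUALITY" else None
```

The design choice is that the default path creates the regulated record and the exception path requires a written justification, rather than the other way round.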
Mistake 6 — ignoring the pattern layer
The single biggest payoff from digital andon is the cluster view: same line, same cell, same root cause, five times this week. That pattern is a CAPA candidate. Without an overnight clustering job + dashboard, the data sits dead.
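The cluster view can be approximated with a grouped count over a rolling window. This is a minimal sketch, assuming events are dicts with hypothetical keys `line`, `cell`, `category`, `root_cause` and `resolved_at`; the defaults follow the example given in the lifecycle section (more than 5 events of the same root cause in 7 days), and a production job would likely be more elaborate.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

ClusterKey = Tuple[str, str, str, str]  # (line, cell, category, root_cause)


def find_clusters(events: List[dict], now: datetime,
                  window_days: int = 7, threshold: int = 5) -> Dict[ClusterKey, int]:
    """Count events per line x cell x category x root-cause inside the window
    and keep only the groups whose count exceeds the threshold."""
    cutoff = now - timedelta(days=window_days)
    counts = Counter(
        (e["line"], e["cell"], e["category"], e["root_cause"])
        for e in events
        if e["resolved_at"] >= cutoff
    )
    return {key: n for key, n in counts.items() if n > threshold}
```

Each returned key is a candidate problem statement for the CAPA backlog; the count and the underlying event list are the supporting evidence.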
09 Where V5 Ultimate fits
V5 ships digital andon as a first-class kiosk channel — five colour-coded categories, role-routed responders, live MTTA / MTTR clocks, auto-promotion into the regulated pipeline, and an overnight cluster job that feeds the CAPA backlog.
- Kiosk tile — every line / cell carries a single Andon tile at the top of the kiosk; one tap surfaces the five categories with ANSI-Z535 colour coding; second tap raises the event.
- Role routing — responders are configured per line × category × shift; the page lands on their phone / tablet + the supervisor dashboard + the overhead board within 5 seconds.
- Live clocks — the acknowledgement and resolution clocks (the inputs to MTTA and MTTR) count up on the kiosk, on the dashboard, and on the overhead board; each stops when its step is completed; everything is server-issued so timestamps are tamper-evident.
- Auto-promote — QUALITY andons that meet the configured threshold auto-create a draft deviation with the andon context pre-filled; EQUIPMENT andons auto-create a maintenance WO; SAFETY andons auto-create an EHS incident; the operator does not have to repeat themselves.
- OEE / 22400 linkage — every andon resolution clock writes to the downtime ledger with category + root-cause; the OEE Availability calculation and MTTR / MTBF KPIs are populated from the same event stream — no parallel data capture.
- Cluster view — an overnight job clusters events by line × cell × category × root-cause; clusters above the threshold appear in the CAPA backlog as candidate problem statements with the supporting event list one-click away.
- Mobile-safe — the responder workflow works on iPhone (≤390 px CSS width) with no horizontal scroll; the page → acknowledge → resolve loop is two taps.
- Audit-trail — every event is an EU GMP Annex 11 §9 audit-trail entry, complete with promotion link to any downstream regulated record.
10 Frequently asked questions
Is andon mandated by any regulation?
No regulation says "thou shalt have andon". But several regulated workflows — 21 CFR 211.192 production-record review, ISO 9001 §10.2 nonconformity control, ISO 13485 §8.3 nonconformity, OSHA accident-reporting timeliness — require timely capture and resolution of issues raised on the line. Digital andon is the cleanest evidence that the obligation is met.
How many categories should we have?
Five is the usual number: material, quality, equipment, safety, support / changeover. Adding more degrades response time more than it improves analysis. Keep the category list short and push the granularity into the root-cause picklist at resolution.
What's a healthy MTTA?
Under 2 minutes for QUALITY + EQUIPMENT, under 30 seconds for SAFETY, under 5 minutes for MATERIAL + SUPPORT. The trend matters more than the absolute number — drift upward is the first sign the culture is slipping.
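These targets, and the advice to watch the trend, can be expressed as two small checks. The sketch below uses the targets stated in this answer; the function names, the dict layout and the three-week window for drift are assumptions made for illustration.

```python
from typing import Dict, List

# Targets in seconds, taken from the answer above ("under" means strictly less than).
MTTA_TARGET_S = {
    "QUALITY": 120,
    "EQUIPMENT": 120,
    "SAFETY": 30,
    "MATERIAL": 300,
    "SUPPORT": 300,
}


def over_target(mtta_s: Dict[str, float]) -> List[str]:
    """Return the categories whose MTTA is not under its target, sorted by name."""
    return sorted(c for c, value in mtta_s.items() if value >= MTTA_TARGET_S[c])


def drifting_up(weekly_mtta_s: List[float], weeks: int = 3) -> bool:
    """True when MTTA has risen for `weeks` consecutive week-over-week steps."""
    recent = weekly_mtta_s[-(weeks + 1):]
    return len(recent) == weeks + 1 and all(b > a for a, b in zip(recent, recent[1:]))
```

The second check can fire while every category is still inside its target, which is the point: a steady upward drift is an early warning even when the absolute numbers look healthy.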
Does the line have to actually stop when andon is triggered?
It depends on the criticality. In the original Toyota model the line stops on every pull. In modern plants with mixed criticality, MATERIAL + SUPPORT often do not stop the line (a parallel station continues), QUALITY + EQUIPMENT often do stop the affected cell, and SAFETY always stops the line. The decision is configured per category × line.
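A per category × line stop policy can be sketched as a default table plus overrides, with SAFETY hard-wired to stop. The defaults below reflect the "often" cases described in this answer; the override structure and the function name `stops_line` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Default behaviour per category, following the mixed-criticality description above.
DEFAULT_STOPS = {
    "MATERIAL": False,
    "SUPPORT": False,
    "QUALITY": True,
    "EQUIPMENT": True,
    "SAFETY": True,
}


def stops_line(category: str, line: str,
               overrides: Optional[Dict[Tuple[str, str], bool]] = None) -> bool:
    """Decide whether an andon of this category stops the given line."""
    if category == "SAFETY":
        return True  # never configurable: a safety andon always stops the line
    overrides = overrides or {}
    return overrides.get((line, category), DEFAULT_STOPS[category])
```

For example, a single-piece-flow line with no parallel station might set `{("LINE-7", "MATERIAL"): True}` so that a stock-out there does stop the line, without changing the default for other lines.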
Can andon replace a deviation system?
No. Andon is the moment-of-truth signal; the deviation system is the regulated record + investigation + CAPA. Andon feeds deviation; it does not replace it. The auto-promote mechanism is what makes them work as one workflow.
Does andon make sense for low-volume / high-mix production?
Yes, arguably more so, because the cost of a defect in a small batch is higher and the window to react is shorter. The same five categories apply; the response-time targets may be slightly more relaxed because responders may be supporting multiple lines.
How does V5 render andon on the kiosk?
As a single high-contrast tile pinned at the top of the kiosk. One tap raises the five-category picker; second tap fires the event. The acknowledgement + resolution flow lives in the responder's mobile view; the operator just sees a live status indicator (acknowledged → in-progress → resolved) without leaving their step.
Primary sources
- Toyota Production System — Jidoka and the Andon cord (Toyota Global)
- ISO 22400-2 — Manufacturing KPIs (downtime, MTTR, availability)
- ISA-95 Part 3 — MOM activity groups (resource & dispatching coordination)
- ANSI Z535 — Safety colour / signal-word standards (visual signalling)
- OSHA 29 CFR 1910.145 — Specifications for accident-prevention signs and tags
- 21 CFR 211.192 — Production-record review (downtime / deviation linkage)
- Liker, J. — The Toyota Way (2nd ed., McGraw-Hill 2021) — jidoka chapter
Further reading
- OEE: The KPI andon downtime events feed.
- ISO 22400: The KPI standard whose downtime / MTTR / MTBF definitions andon events populate.
- MES: The Level-3 platform digital andon runs inside.
- EWI: The kiosk surface where the andon tile lives.
- Deviation: The regulated record an andon escalation often promotes to.
- CAPA: The loop where repeated andon patterns become corrective action.
- Takt time: The pace andon protects by surfacing problems instantly.
V5 Ultimate ships with the Andon controls already wired in — audit trail, e-signatures, validation evidence. Free trial, no credit card, onboard in days, not months.
