Dust and Stars - 1992 | Chapter 305 | Hashi and the Pages | English
Chapter 305: Hashi and the Pages
The verification script finished its first run, taking forty-two minutes.
The progress bar on the screen stopped at 91%. Lin Chen stared at the terminal output, his breathing light. The raw migration log was a hex-encoded archive; after decompressing it, he extracted the SHA-256 checksums. He had written a simple comparison function to match the hashes inside the archive, line by line, against the public fingerprints backed up three years ago by the IT department of the Provincial Second Hospital.
Match result: 98.7% consistent.
The remaining 1.3% corresponded to that batch of fourteen thousand records with completely overlapping timestamps. Lin Chen wasn't surprised. He had long anticipated that when the outsourcing team performed the full overwrite back then, in their rush to meet deadlines, they likely skipped verification on some non-core fields and directly rewrote the timestamps and primary keys. Technically, this was called a "dirty migration"; in audit terms, it was a "data integrity flaw."
He flexed his left foot. The numbness had receded above the ankle, replaced by a sore, aching tension from continuously contracting muscle fibers. The ibuprofen had peaked around 3 a.m. and was now slowly wearing off. He pulled open the drawer, felt for the pill bottle, glanced at the label reading "Do not exceed 4 doses in 24 hours," and screwed the cap back on. He couldn't rely on painkillers to mask nerve signals. He needed to stay clear-headed, knowing exactly where the pain was, so he could avoid postures that would trigger spasms.
He nudged his chair back half an inch, letting his left foot hang in the air, his heel resting lightly against the edge of the computer case's ventilation grille. Cold air blew in, slightly dulling the sharp pain. He created a new text file and named it AUDIT_TRAIL_MIGRATION_2019Q3.md.
No embellishment, no cover-up. He pasted in the raw output of the hash comparison, sampled screenshots of the timestamp collisions, and a summary of the outsourcing team's migration logs from back then. In the conclusion field, he typed: Historical data migration involved overwrite operations, causing temporal distortion in 1.3% of records. Isolation tags have been applied; core clinical metric traceability remains unaffected. It is recommended to cross-verify with paper archive catalogs during audit.
Compliance isn't about perfection; it's about transparency. He had written that sentence three times in his mistake notebook.
At 7:20 a.m., the glass door to the server room was pushed open. Su Man walked in, carrying the chill of the outdoors with her, holding two kraft paper file bags in her hands and a camera bag slung over her shoulder. She placed the file bags on the empty desk next to Lin Chen with a dull thud.
"Old Wang from the IT department would only give me these," Su Man said, pulling out a chair and sitting down, her voice slightly hoarse. "Paper medical record archive catalogs from 2016 to 2018, bound by department and month. Everything after 2019 was fully digitized; no paper backups were kept. I took high-resolution photos and sorted them by timestamp."
Lin Chen nodded without speaking. He took the file bags and opened them. Inside was a thick stack of printed paper, the edges already yellowed, carrying the musty smell and dust characteristic of old documents. He pulled out the first one: follow-up records from the Endocrinology Department in March 2016. The handwriting on the pages was in ballpoint pen, some of it already smudged.
"Start the cross-reference," he said.
For the next three hours, the server room was filled only with the clacking of keyboards and the rustling of turning pages. Lin Chen imported the photos locally, wrote a simple OCR preprocessing script to extract patient IDs, visit dates, and primary diagnoses from the paper catalogs, and then matched each extracted entry against the cleaned records in the database.
Match rate: 89.4%.
The remaining 10.6% was concentrated mainly in the second half of 2017 and the 2019 migration period. Lin Chen checked them one by one. Some were omissions in the paper catalogs, some were data entry errors in the electronic system, and others were simply due to doctors' illegible handwriting back then, causing the IT staff to read the wrong line when entering data. He encountered a record with "missing chief complaint but complete lab indicators"; the paper catalog only bore a blue "Archived" stamp, while the electronic system showed a string of garbled text. He didn't force a fix. Instead, he took a side-by-side screenshot of the raw byte snapshot of the garbled text and the photo of the blue stamp, tagging it ARCHIVE_MISMATCH.
"How do we handle this batch?" Su Man pointed to a group of red-flagged records on the screen. "The chief complaint and lab indicators don't match. The paper catalog says 'Hypertension Follow-up,' but the electronic system says 'Diabetes Initial Screening.' The timestamps are also from that colliding batch."
Lin Chen stared at the line, his fingers tapping lightly twice on the desk.
"Keep the raw snapshots," he said. "No forced corrections. Create a separate 'Data Ambiguity Sample Set' in the report, attaching the paper catalog photos and electronic system screenshots. Let the ethics committee judge for themselves."
Su Man glanced at him. "They'll reject it and send it back for revision."
"Rejection is their procedure," Lin Chen said, his voice flat. "We're submitting real-world data, not a standard answer key. Ambiguity itself is part of the data."
Su Man didn't argue further. She picked up her pen, crossed off a line on her notepad, and continued organizing the supplementary materials.
At 11 a.m., his phone vibrated on the desk. A WeChat message from Old Zhao.
"The Provincial Second Hospital's preliminary ethics review has passed. The Health Commission is pushing for progress. Your report must be in the system by 5 p.m. tomorrow. Don't delay."
Lin Chen stared at the screen, his thumb hovering over the keyboard. 5 p.m. tomorrow. Twelve hours ahead of the original schedule. Behind Old Zhao's urgency was the reality that competitors had already taken their positions, and the window of opportunity was narrowing. He replied: "Received. Draft report will be submitted for internal review by 8 p.m. tonight. The de-identification pipeline is running smoothly, and the anomaly queue has been isolated. Please simultaneously confirm the upload interface version for the Health Commission system."
Old Zhao didn't reply.
Lin Chen put down his phone and pulled his focus back to the screen. Time had compressed. He had to complete the final round of mapping verification, generate the PDF report, and package all raw logs and cross-verification paths before 8 p.m. tonight.
He pulled up the configuration file for DataAnonymizerV1 and changed the isolation strategy for the MIGRATION_ARTIFACT and PARSE_ERROR tags from "log only" to "independent archive." This meant the report size would increase, but the audit trail would be more complete. He hit Enter to recompile.
The compilation progress bar crawled upward slowly. Lin Chen's left foot began to twitch uncontrollably. He clenched his back teeth, pressed his right hand against his knee, and shifted all his body weight onto his right leg and the chair back. Cold sweat seeped from his temples, dripping onto the edge of the keyboard. He didn't wipe it away. He deliberately slowed his breathing, breaking the pain down into manageable rhythms.
"You don't look right," Su Man said, stopping her pen and turning to look at him.
"The medication wore off," Lin Chen said quietly. "It's fine. The pipeline is running; this isn't costing it any compute."
Su Man stood up, walked over to the water dispenser, filled a cup with warm water, and placed it beside his hand. "Drink it. Then go lie on the recliner and close your eyes for twenty minutes. You don't need to stare at the screen while the report generates."
Lin Chen picked up the cup and drank it slowly. The warm water traveled down his esophagus, bringing a slight warmth to his stomach. He leaned back against the chair and closed his eyes. In the darkness, the hum of the server fans, the read/write sounds of the hard drives, and the rustling of Su Man organizing papers were amplified. He didn't fall asleep; he just ran through the report's table of contents in his mind: Abstract, Data Source Description, De-identification Rules, Anomaly Handling Trail, Cross-verification Results, Audit Declaration.
Every chapter had to withstand scrutiny.
Twenty minutes later, he opened his eyes. Compilation complete. The log window displayed: ALL TASKS COMPLETED. REPORT DRAFT READY.
He sat up straight and opened the generated PDF. Three hundred twenty pages. Neatly formatted, clear charts and graphs, with the raw hash values and paper catalog cross-reference table for that batch of fourteen thousand colliding records resting in the appendix. No beautification, no omissions.
"First draft." He dragged the file into the internal shared directory. "Check the formatting. I'm preparing to package and upload."
Su Man opened the file and scrolled quickly. Her gaze paused on the "Data Ambiguity Sample Set" page for three seconds.
"It works," she said. "But the Health Commission's upload system has a file size limit. Anything over 500 MB will automatically compress images. Your paper catalog photos are high-res; they'll get blurred. If key information is lost, the audit will outright reject it."
Lin Chen's fingers paused.
"Convert to vector graphics?" he asked.
"No time." Su Man shook her head. "Before 8 p.m. tonight, you can only do one of two things: either trim the appendix, or rewrite the image compression algorithm to guarantee no loss of key information. The system interface is an old version; it doesn't support chunked uploads."
Lin Chen looked at the progress bar on the screen. Deleting the appendix meant breaking the audit trail. Rewriting the algorithm meant he would have to code a compression module that lost no key information, while pushing past the limits of his foot injury and exhaustion.
He pulled out his mistake notebook, flipped to a fresh page, and let his pen tip fall: Image compression. 500 MB threshold. Preserve key information. Legacy interface constraints.
"Not deleting," he said. "I'll write the script. You prepare the upload account."
Su Man looked at him for two seconds, then nodded. "Alright. I'll send you the interface documentation. Note that their verification logic uses byte-by-byte comparison; the archive format must match exactly."
Lin Chen placed his hands back on the keyboard. The cold light of the screen reflected on his face. His left foot still throbbed with pain, but he didn't move. He pulled up the low-level documentation for the image processing library and began deconstructing the compression logic. No off-the-shelf tools; he would write the core algorithm himself. High-frequency information would stay in the lossless zone; low-frequency backgrounds would take the lossy degradation. Line by line, he coded it neatly.
Time was still running. The next step was compression.