comtech onsite Flashcards
(19 cards)
How did you handle error handling with your metal binder jetting project?
Error handling was done through code. We had data that was batched and pushed over Ethernet to an edge node—basically an Ubuntu server that acted as the staging layer.
The edge node received the batches via a simple REST API (Flask) we set up, and then performed data validation, timestamp normalization, and deduplication.
For example, if a Pi re-sent buffered data, the code running on the edge node would detect and skip duplicates using a hash of timestamp + sensor ID + value.
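A minimal sketch of that dedup check (the dict field names are illustrative, and a real deployment would persist the seen hashes rather than keep them in memory):

```python
import hashlib

seen = set()  # illustrative: in practice this would be backed by the staging DB

def is_duplicate(reading: dict) -> bool:
    """Skip a reading if its timestamp + sensor ID + value was seen before."""
    key = f"{reading['timestamp']}|{reading['sensor_id']}|{reading['value']}"
    digest = hashlib.sha256(key.encode()).hexdigest()
    if digest in seen:
        return True
    seen.add(digest)
    return False
```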
How do you, as a Software Engineer, typically approach error handling and defensive coding?
My approach to error handling is to make sure the code can handle errors gracefully, and to keep debugging and maintenance work as easy as possible.
The first thing I consider comes before any error happens: making my code modular and easy to change. For example, if an error occurs in how we transform data before loading, I should be able to make the fix there and nowhere else.
Second, when an error occurs, I make sure the process can continue if at all possible, through retries or adaptive processing (as sketched below).
Lastly, documentation and visibility are critical. It is important to be able to understand the code, and to get visibility into how it executes through logging at various levels: DEBUG, INFO, WARNING, and so on.
This is what helps us identify how an error happened and how it can be reproduced.
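A minimal sketch of the retry-and-log pattern described above (the function, attempt limits, and messages are all illustrative):

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def with_retries(task, attempts=3, delay=2.0):
    """Retry a failing step so the overall process can continue."""
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception:
            logger.warning("attempt %d/%d failed", attempt, attempts, exc_info=True)
            time.sleep(delay * attempt)  # simple linear backoff
    logger.error("all %d attempts failed; moving on", attempts)
    return None
```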
When it came to printer data, did you ever face situations where sensor readings were erratic? How did you handle outliers or calibration?
Yes, we occasionally saw erratic values.
In our case, we implemented a primitive identification method through SQL code.
If a value deviated from the previous reading by more than a given threshold, a TRUE value was written to a deviation column in the final table. This allowed us to at least identify where deviations started when looking at the time series.
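The original check lived in SQL; a rough Python equivalent of the same previous-row threshold logic, with an illustrative threshold:

```python
def flag_deviations(values, threshold=5.0):
    """Flag readings that jump more than `threshold` from the previous one."""
    flags = [False]  # the first reading has nothing to compare against
    for prev, curr in zip(values, values[1:]):
        flags.append(abs(curr - prev) > threshold)
    return flags

# flag_deviations([21.0, 21.2, 29.8, 21.1]) -> [False, False, True, True]
```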
When it came to printer data, what kind of queries or KPIs did researchers rely on most from this data?
Researchers primarily used the telemetry to correlate environmental stability with print quality.
Common KPIs included average and peak internal temperature variance, humidity fluctuation, and airflow consistency across time intervals.
Why did you choose Modbus TCP for networking your test rig to the edge node?
We chose Modbus TCP because the PLC and edge node were both on the same internal Ethernet subnet.
Modbus TCP supported the polling we needed and was easy to implement using Python libraries like **pymodbus**.
How did you handle Modbus register mapping and decoding?
The PLC exposed specific holding registers with offsets for state codes and timestamps.
I worked with the engineering lead to define a shared register map.
On the edge node, my Python script used pymodbus.client to poll those registers, decode the values into timestamps or state flags, and translate them into structured logs for our test cycles.
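A minimal sketch of that polling-and-decoding flow, assuming a pymodbus 3.x client (the address, unit ID, and register layout are placeholders, and keyword names vary slightly across pymodbus versions):

```python
from pymodbus.client import ModbusTcpClient

# Illustrative register layout: one state code plus a 32-bit timestamp
# split across two 16-bit holding registers.
client = ModbusTcpClient("192.168.1.50", port=502)  # placeholder PLC address
client.connect()

result = client.read_holding_registers(address=0, count=3, slave=1)
if not result.isError():
    state_code = result.registers[0]
    timestamp = (result.registers[1] << 16) | result.registers[2]  # combine words
    print(f"state={state_code} ts={timestamp}")

client.close()
```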
Tell me about a project where you worked with Python in a performance-critical system. How did you ensure efficiency and stability?
At Bass Pro Shops, I worked on a project that involved processing high-volume transactional data.
The core challenge was that our transaction historization system—used to track and update sales records—was taking over 3 hours to run and often failed under peak loads, directly impacting reporting and business decisions.
I tackled this by first documenting the entire process end to end to understand it, then using a timing context manager to identify performance bottlenecks from execution times.
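A minimal sketch of that kind of timing context manager (the step name and workload are illustrative):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(step_name: str):
    """Print how long a pipeline step takes, to locate bottlenecks."""
    start = time.perf_counter()
    try:
        yield
    finally:
        print(f"{step_name}: {time.perf_counter() - start:.2f}s")

# Wrap each suspect stage of the pipeline.
with timed("row comparison"):
    _ = [i * 2 for i in range(1_000_000)]  # stand-in for the real work
```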
I found that row comparisons were being done inefficiently, and the data transformation library in use was memory-heavy and poorly suited to scale.
To address this, I rewrote the historization logic using hashing to enable fast, one-to-one row comparisons.
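A hedged sketch of hash-based row comparison (column and key names are invented for illustration):

```python
import hashlib

def row_hash(row: dict) -> str:
    """Hash all columns so two rows compare with one equality check."""
    key = "|".join(f"{col}={row[col]}" for col in sorted(row))
    return hashlib.md5(key.encode()).hexdigest()

def changed_rows(old_rows, new_rows, key_col="sale_id"):
    """Return rows from new_rows that are new or differ from their old version."""
    old_hashes = {r[key_col]: row_hash(r) for r in old_rows}
    return [r for r in new_rows if old_hashes.get(r[key_col]) != row_hash(r)]
```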
I introduced partitioning strategies to divide the data by key dimensions, which drastically reduced the size of the working dataset in each operation.
I also replaced the legacy transformation library with a modern Python library that reduced memory usage and improved compute efficiency.
As a result, I brought down processing time from over 3 hours to just 15 minutes—consistently.
Have you had any experience working on Linux-based embedded systems or devices? Can you describe it?
Yes, while with the NRC I used Raspberry Pis running Linux to collect environmental telemetry—temperature, humidity, and internal airflow—using sensors.
These Pis polled data at intervals, batched it, and transmitted it over Ethernet to an edge node, which loaded it into a PostgreSQL database.
On the software side, I wrote Python scripts to manage the sensor polling and batching logic. I also implemented system monitoring to log uptime and resource usage.
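A minimal sketch of the poll-and-batch loop, assuming the Flask staging API mentioned earlier (the endpoint, interval, batch size, and sensor read are all placeholders):

```python
import time
import requests

EDGE_URL = "http://edge-node.local/ingest"  # placeholder staging endpoint
BATCH_SIZE = 60
POLL_INTERVAL = 5  # seconds

def read_sensor() -> dict:
    """Placeholder for the real temperature/humidity/airflow reads."""
    return {"timestamp": time.time(), "sensor_id": "temp-01", "value": 21.5}

batch = []
while True:
    batch.append(read_sensor())
    if len(batch) >= BATCH_SIZE:
        requests.post(EDGE_URL, json=batch, timeout=10)  # ship the batch
        batch.clear()
    time.sleep(POLL_INTERVAL)
```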
Explain Ethernet and Modbus
Ethernet is the infrastructure: the physical and network connection between the PLC and the edge node.
Modbus is the protocol used for communication: how data is structured and interpreted.
It is the protocol that runs on top of Ethernet. It defines how the PLC and the edge node structure their messages to read from or write to registers, coils, or memory locations in the PLC.
Have you worked with any IoT protocols like MQTT or Modbus? If not, how comfortable would you be picking these up quickly?
Yes, I’ve worked directly with Modbus during my time at the National Research Council.
On our thermal test bench project, we used a Siemens PLC to monitor thermal cycling of the test rig.
I set up an edge node to poll the PLC using Modbus, retrieving real-time data on cycle timing. This powered our error detection system and triggered alerts if thresholds were breached.
While I haven’t used MQTT specifically, I’m confident in my ability to learn it quickly.
My experience working with Modbus—and integrating data from embedded systems into databases—has given me a strong foundation in IoT communication patterns.
You mentioned .NET desktop development. Do you have any experience with modern front-end frameworks like React or Vue?
While my core experience with UI development has been in .NET—particularly using WinUI for building a desktop training application at Public Safety Canada—I’ve also explored modern front-end frameworks like React.
I haven’t used React in a production environment, but I understand its component-based architecture and how it fits into full-stack workflows.
I’m comfortable picking up new frameworks quickly and already have experience integrating front ends with back-end systems.
Given my track record adapting to technical stacks, I’m confident I could ramp up on React or Vue and contribute meaningfully to any front-end effort.
Do you have experience deploying applications to cloud platforms like AWS, Azure, or GCP? If not, what do you know about infrastructure as code or containers?
Yes, I have experience deploying and managing data applications on Google Cloud Platform (GCP). At Bass Pro, I redesigned our ETL infrastructure by moving us off a Windows-based virtual desktop setup and into a modern cloud environment using GCP, Airflow, and Snowflake. I deployed Apache Airflow on two Linux-based GCP VMs, configuring them for production and development orchestration. I also migrated raw file storage into GCP Cloud Storage buckets and used Snowflake as our cloud data warehouse.
While I didn’t use tools like Terraform in this specific role, I managed cloud resources manually through infrastructure configuration and scripting on Linux. I’m also familiar with the principles of infrastructure as code, containerization with Docker, and am comfortable learning tools like Terraform or Kubernetes if needed.
This project significantly improved system stability, enabled faster deployments, and gave me hands-on experience working with cloud-native technologies in a production setting.
What is MQTT?
MQTT (Message Queuing Telemetry Transport) is a publish/subscribe messaging protocol.
A publisher sends data to a topic (e.g., sensor_data),
A subscriber will subscribe to the topic,
and an intermediary (broker) will deliver the data.
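A minimal publish/subscribe sketch with paho-mqtt, assuming its 1.x callback API (the broker address and topic are illustrative):

```python
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    print(f"{msg.topic}: {msg.payload.decode()}")

# Subscriber: asks the broker for everything published to the topic.
sub = mqtt.Client()
sub.on_message = on_message
sub.connect("localhost", 1883)
sub.subscribe("sensor_data")
sub.loop_start()

# Publisher: sends a reading to the topic; the broker delivers it.
pub = mqtt.Client()
pub.connect("localhost", 1883)
pub.publish("sensor_data", "21.5")
```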
Describe the process of a CI/CD pipeline
Configure a YAML configuration file for your repo and set up Actions.
When you push code or open a merge request, the pipeline runs automatically using that configuration.
It checks out your code, installs dependencies, and runs your tests.
If all steps succeed, your code is considered ready for deployment.
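A minimal sketch of such a configuration, using GitHub Actions syntax as one example (the file path, Python version, and steps are illustrative):

```yaml
# .github/workflows/ci.yml
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4              # check out the code
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt   # install dependencies
      - run: pytest                            # run the tests
```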
What is containerization?
Containerization is a technology that packages an application and all its dependencies into a container.
Containers ensure that the application runs the same way regardless of where it is deployed, eliminating issues caused by differences in environments.
At scale, container deployments are managed using orchestration tools like Kubernetes.
A container is a running instance of a container image.
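A minimal Dockerfile sketch of packaging an app and its dependencies into an image (the file names are placeholders):

```dockerfile
# Base image provides the language runtime
FROM python:3.11-slim
WORKDIR /app
# Bake the dependencies into the image so every environment matches
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
# What the running container executes
CMD ["python", "app.py"]
```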
Public Safety Canada – Python Automation
“You built a rule-based scoring system in Python—what kind of data inputs did it consume, and how were the rules maintained or updated?”
The system consumed reports from an internal database, including metadata fields like category types, source reliability, and contextual tags.
The scoring was driven by rules for high-confidence signals, such as blacklisted keywords or known affiliations. Statistical thresholds were used to combat inaccurate rule scoring, usually caused by spam and obfuscation.
Rules were defined in a configuration layer, separate from the logic, so they could be easily maintained or tuned by non-developers.
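A hedged sketch of that config/logic separation, with invented rule names and weights (inlined here so the sketch is self-contained; in practice the rules would sit in a separate file):

```python
import json

# Illustrative rules config — editable without touching the code.
RULES_JSON = """
{
  "blacklisted_keywords": {"weight": 20, "terms": ["example-term"]},
  "min_score_to_flag": 15
}
"""
rules = json.loads(RULES_JSON)

def score_report(report: dict) -> int:
    """Apply high-confidence keyword rules to a report's text."""
    score = 0
    kw = rules["blacklisted_keywords"]
    if any(term in report.get("text", "") for term in kw["terms"]):
        score += kw["weight"]
    return score

report = {"text": "mentions example-term in passing"}
flagged = score_report(report) >= rules["min_score_to_flag"]  # True here
```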
Can you walk me through the triage workflow and how you structured your MongoDB schema for that purpose?
At Public Safety Canada, I built an automated triage system to reduce the manual workload analysts faced when reviewing unstructured intelligence reports.
The core of the system was a Python-based pipeline backed by MongoDB, which gave us the flexibility we needed to handle highly variable report formats.
Each document in MongoDB represented a single report and was structured with key extracted fields, like report_id, source, and confidence_score.
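An illustrative document shape (all values, and the fields beyond the three named above, are invented):

```python
# One MongoDB document per report; extra fields varied by report format.
report_doc = {
    "report_id": "RPT-2023-0142",      # invented ID
    "source": "partner-agency-feed",   # invented source
    "confidence_score": 0.82,
    "tags": ["priority-review"],       # hypothetical extracted field
    "raw_text": "...",                 # original unstructured body
}
```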
During ingestion, reports were parsed to pull out relevant fields.
I then applied filters and a scoring layer to flag high-priority reports for review.
I also built a SQL layer on top of the structured outputs for reporting and auditing.
This hybrid Mongo + SQL approach gave us both the schema flexibility needed for upstream ingestion and the rigid structure needed downstream.
The result was a faster, more reliable triage process that allowed analysts to focus on the highest-value intelligence.
Can you describe what an embedded system is, and how development in that environment differs from general-purpose software?
An embedded system is a specialized computer system designed to perform a dedicated function or set of functions, often as part of a larger device.
Embedded system development is more hardware-focused, resource-constrained, and often requires real-time operation, making it quite different from general-purpose software development.
Given a Linux-based embedded system with no GUI, how would you go about debugging a service that’s crashing during startup?
Check the system logs (for example, journalctl -u <service> on a systemd system) and any stack trace the service leaves behind; if possible, use a debugger to step through execution line by line.
Also make sure the system isn’t running out of memory, file handles, or other resources.