What is Q▪Kernel?

Q▪Kernel is a Real Time Operating System, or Kernel, specifically developed for a new generation of processors like the Microchip 16-bit and 32-bit processors. Q▪Kernel fully exploits the capabilities of those processors by implementing its unique micro kernel segmented interrupt architecture, making it the fastest RTOS. The architecture enables dual mode capabilities not found in any other Real Time Operating System.

For a more detailed description of the features, please download the documents below.

Q▪Kernel User Guide

Feature Guide

Q▪Kernel Reference Guide



Dual-Mode RTOS

A new breed of embedded applications is rapidly evolving. Traditional DSP applications are adding networking and other control functionality. But at the same time, the typical MCU control application will often include High Dataflow requirements like streaming media and other DSP functions. An emerging solution for this new class of 'hybrid' application is the convergent processor. This design approach combines both DSP and RISC/microcontroller capabilities into a single, unified architecture.

A processor that implements this architecture can operate as a DSP engine, be totally dedicated to a control application, or can operate somewhere in between. This makes those processors suitable for everything from industrial control to portable devices. Single convergent processors are an attractive alternative to the larger and more costly RISC and DSP processors. The dsPIC and PIC32MX from Microchip are convergent processors that can replace the RISC and DSP processors.

While Microchip implemented the hardware for convergent processing, the software is often lacking. A traditional multithreading RTOS adds enormous overhead to the DSP portion of the application. While a simple scheduler may work fine for the DSP or High Dataflow part of the application, it is not a good solution for a control application.

The traditional RTOS requires a stack for every thread, so that a thread can block while waiting for an event and can be preempted by a higher priority thread. Switching between threads, known as context switching, is an expensive operation, especially for processors with many registers like the dsPIC/PIC24 (20 registers) and PIC32 (40 registers). The larger the context, the longer the switch takes.

DSP and High Dataflow applications typically read a block of data, run an algorithm on the data, and then send the data to another programming unit for further processing. Due to the real-time nature of the data, the algorithm must start within a very tight window once the data becomes available. Developers often design their own custom executive that handles the High Dataflow based on a cooperative scheduling model. To combine this model with a traditional RTOS, they run the algorithms in a high priority thread, which adds a lot of overhead.

The Dual-Mode RTOS combines the traditional thread-based kernel architecture for real-time control processing with specialized fibers for High Dataflow operations. The architecture accommodates the different needs for both domains by separating them. Q▪Kernel enables both types of application code to run fully optimized on a single processor, and both fibers and threads use a common API.

In order to meet real-time requirements, the DSP and High Dataflow processes run as fibers, at a priority that is higher than control threads which ensures they get access to the CPU. These fibers are lightweight because they have no context, making the switch from fiber to fiber very fast. Furthermore, fibers run at a priority just below that of interrupt handlers, a position that tends to reduce startup latency and minimize jitter.
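The run-to-completion behavior of fibers can be illustrated with a minimal host-side sketch. It is not the Q▪Kernel API: the names fiber_post, fiber_run_all and fiber_demo, the queue depth and the number of priorities are all assumptions made for illustration. It shows why no per-fiber context is needed: each fiber is an ordinary function call on one shared stack.

```c
#include <assert.h>
#include <stddef.h>

/* All names are hypothetical; this is not the Q-Kernel API. */
#define NUM_PRIORITIES 4   /* 0 is the highest fiber priority */
#define QUEUE_DEPTH    8

typedef void (*fiber_fn)(void *arg);

typedef struct {
    fiber_fn fn[QUEUE_DEPTH];
    void    *arg[QUEUE_DEPTH];
    size_t   head, count;
} fiber_queue_t;

static fiber_queue_t ready[NUM_PRIORITIES];

/* Queue a fiber for execution. Returns 0 on success, -1 if the queue is full. */
int fiber_post(unsigned prio, fiber_fn fn, void *arg)
{
    assert(prio < NUM_PRIORITIES && fn != NULL);
    fiber_queue_t *q = &ready[prio];
    if (q->count == QUEUE_DEPTH)
        return -1;
    size_t tail = (q->head + q->count) % QUEUE_DEPTH;
    q->fn[tail] = fn;
    q->arg[tail] = arg;
    q->count++;
    return 0;
}

/* Run queued fibers to completion, highest priority first.
 * Each fiber is a plain function call on the caller's stack, so no
 * register context is saved or restored between fibers. */
int fiber_run_all(void)
{
    int ran = 0;
    unsigned p = 0;
    while (p < NUM_PRIORITIES) {
        fiber_queue_t *q = &ready[p];
        if (q->count == 0) {
            p++;
            continue;
        }
        fiber_fn fn = q->fn[q->head];
        void *arg = q->arg[q->head];
        q->head = (q->head + 1) % QUEUE_DEPTH;
        q->count--;
        fn(arg);
        ran++;
        p = 0;   /* the fiber may have posted higher-priority work */
    }
    return ran;
}

/* Demo: fibers posted in the order 3, 1, 2 run in priority order 1, 2, 3. */
static int order;

static void record(void *arg)
{
    order = order * 10 + *(int *)arg;
}

int fiber_demo(void)
{
    static int ids[] = { 1, 2, 3 };
    order = 0;
    fiber_post(2, record, &ids[2]);
    fiber_post(0, record, &ids[0]);
    fiber_post(1, record, &ids[1]);
    fiber_run_all();
    return order;
}
```

In a real kernel the dispatcher would be entered when the last interrupt handler returns; here it is called directly so the sketch can run on a host.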

Low Power and Tick-Less

Many embedded applications spend most of their time waiting for an event to happen: a touch on a panel, incoming communication or the end of a time delay. In many applications the processor is only active for a small amount of time, and battery life can be extended significantly by placing the processor into idle or sleep mode. Maximizing idle or sleep time and minimizing active time is the key to extending battery life.

This means that state machines with looping and polling don’t work well, but interrupt driven applications based on an RTOS do. While an RTOS is a much better solution, it has one disadvantage. Most RTOSes require a periodic tick for time management. The tick wakes the processor frequently, which consumes additional power. The shorter the tick time, the more power is consumed, yet applications often require a short tick time for finer timing. A Tick-Less RTOS solves this problem.

Q▪Kernel is tick-less and eliminates polling completely. It also optimizes power saving by splitting timing into a human time scale (1 second to more than 30 years) and a processor time scale (1 µs to 10 s). The human time scale uses the RTCC or the 32 kHz timer, which remains available in sleep mode and therefore saves more power. The processor time scale provides wait times with a granularity of 1 µs. When a short time request is outstanding, the processor can be switched to idle mode. When no short time request is outstanding, the system stops the timer and switches to sleep mode. This means that power consumption is always minimized while the system is waiting for user activity.
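The idle-versus-sleep decision described above can be sketched as follows. This is an illustration under stated assumptions, not Q▪Kernel code: the types short_timer_t and wait_plan_t and the function plan_wait are hypothetical, and the sketch assumes that no active timer has already expired when it is called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; not the Q-Kernel API. */
typedef enum { POWER_IDLE, POWER_SLEEP } power_mode_t;

typedef struct {
    bool     active;      /* true while a short (processor-scale) wait is outstanding */
    uint32_t expires_us;  /* absolute expiry time in microseconds */
} short_timer_t;

typedef struct {
    power_mode_t mode;
    uint32_t     wake_in_us;  /* meaningful only when mode == POWER_IDLE */
} wait_plan_t;

/* Decide how to wait: idle until the nearest short timer expires, or
 * sleep when no short time request is outstanding. There is no periodic
 * tick; the hardware timer would be programmed for wake_in_us once. */
wait_plan_t plan_wait(const short_timer_t *timers, size_t n, uint32_t now_us)
{
    assert(timers != NULL || n == 0);
    wait_plan_t plan = { POWER_SLEEP, 0 };
    for (size_t i = 0; i < n; i++) {
        if (!timers[i].active)
            continue;
        /* Unsigned subtraction stays correct across counter wrap-around. */
        uint32_t delta = timers[i].expires_us - now_us;
        if (plan.mode == POWER_SLEEP || delta < plan.wake_in_us) {
            plan.mode = POWER_IDLE;
            plan.wake_in_us = delta;
        }
    }
    return plan;
}
```

With timers expiring at 150 µs and 120 µs and the current time at 100 µs, the plan is idle mode with a single wake-up 20 µs later; with no active timers the plan is sleep mode.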

Integrated Power Management

Low power consumption is a deciding factor in many embedded applications. More and more applications like medical devices, wireless communications and personal devices demand low power consumption for a better battery life. Most hardware vendors, like Microchip, have equipped their processors with power saving features.

Most processors, including the 16 and 32 bit chips from Microchip, provide several power saving modes. Q▪Kernel contains integrated power management that makes it simple to lower power consumption. When the RTOS is idle, it signals the application and provides the best power saving mode. The application has the ability to disable additional hardware and instruct the RTOS to select the required power mode. The power management module will select this mode without suffering from race conditions. Q▪Kernel handles all race conditions and makes the implementation simple and flexible.

Threads and Fibers

Q▪Kernel combines traditional thread-based kernel architecture for real-time control processing and specialized fibers for High Dataflow operations.

In order to meet real-time requirements, High Dataflow processes run as fibers, at a priority that is higher than control threads which ensures they get access to the CPU. These fibers are lightweight because they have no context, making the switch from fiber to fiber very fast. Furthermore, fibers run at a priority just below that of interrupt handlers, a position that tends to reduce startup latency and minimize jitter.

Never Disables Interrupts

Competing systems are designed to disable interrupts during critical operations. Q▪Kernel is designed to never disable interrupts. The “Segmented Interrupt” architecture makes losing interrupts a thing of the past, and the combination with fibers makes interrupt handling very fast. As a result, Q▪Kernel does not add a single cycle to the interrupt latency and facilitates seamless communication between Interrupt Service Routines, threads and fibers. It can handle very high interrupt rates. In the Thread-Metric interrupt performance test, an interrupt signals a waiting thread. A PIC24H running at 40 MHz can handle more than 530,000 of those tests per second. Please see the performance results here.

This way of processing interrupts prevents interrupt jitter. Reducing jitter is important for Real Time Operating Systems, since they must guarantee that execution of specific code completes within an agreed amount of time. Only “Segmented Interrupt” architectures are able to guarantee jitter-free operation.

Zero Interrupt Latency and No Interrupt Jitter

Interrupt latency is the time between an interrupt request and the execution of the first instruction of the Interrupt Service Routine. It is the sum of several smaller delays, explained below.

The first delay is typically in the hardware. Modern hardware architectures use instructions that are single cycle or double cycle. Hardware typically introduces 4 to 8 cycles of fixed interrupt latency.

The second delay is related to the RTOS and the fact that interrupts are disabled. Most RTOSes are based on the Unified Interrupt Architecture and temporarily disable interrupts to protect critical sections, including thread switching. This contributes to longer interrupt latency.

Q▪Kernel has a segmented interrupt architecture that never disables interrupts. Instead, it postpones communication with the RTOS when a critical section is detected. Jitter is the variation in interrupt latency, and disabling interrupts always leads to interrupt jitter. Because Q▪Kernel never disables interrupts, it avoids this jitter and adds zero latency on top of the hardware delay.
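The idea of postponing kernel communication instead of disabling interrupts can be sketched with a single-threaded host simulation. Everything here is an assumption made for illustration (the names isr_signal, kernel_enter, kernel_exit and sem_value, the queue depth, and the use of a simple busy flag); the actual Q▪Kernel mechanism is not described in this text and may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; a host-side simulation, not the Q-Kernel implementation. */
#define NUM_SEMS    4
#define DEFER_DEPTH 8   /* power of two, so the free-running indices wrap cleanly */

static volatile bool kernel_busy;        /* set while the kernel is in a critical section */
static unsigned deferred[DEFER_DEPTH];   /* kernel requests postponed by interrupt handlers */
static unsigned put_idx, get_idx;
static int sem_count[NUM_SEMS];

static void kernel_signal(unsigned sem)
{
    sem_count[sem]++;
}

/* Called from an interrupt handler. Interrupts stay enabled: if the kernel
 * is inside a critical section, the request is queued instead of executed.
 * Returns false only if the deferral queue is full. */
bool isr_signal(unsigned sem)
{
    assert(sem < NUM_SEMS);
    if (!kernel_busy) {
        kernel_signal(sem);
        return true;
    }
    if (put_idx - get_idx == DEFER_DEPTH)
        return false;
    deferred[put_idx++ % DEFER_DEPTH] = sem;
    return true;
}

void kernel_enter(void)
{
    kernel_busy = true;
}

/* Leave the critical section and replay everything that was postponed. */
void kernel_exit(void)
{
    for (;;) {
        while (get_idx != put_idx)
            kernel_signal(deferred[get_idx++ % DEFER_DEPTH]);
        kernel_busy = false;
        if (get_idx == put_idx)
            break;              /* nothing arrived while the flag was being cleared */
        kernel_busy = true;     /* a late request arrived; drain again */
    }
}

int sem_value(unsigned sem)
{
    assert(sem < NUM_SEMS);
    return sem_count[sem];
}
```

The hardware-facing part of the handler always runs immediately, so the critical section adds nothing to interrupt latency; only the kernel request is delayed until kernel_exit.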

Hard Real-Time

A hard real-time system, also called deterministic or temporally correct, is a system that requires a guaranteed response to specific events within a defined time period. The failure of a hard real-time system to meet these requirements typically results in a severe failure of the system.

The correctness of an operation depends not only upon its logical truth, but also upon the time in which it is performed. The system can be seen as a software extension of the hardware ISR mechanism and guarantees that the highest priority thread that can run shall run. To optimize the deterministic behavior of the system Q▪Kernel implements the following mechanisms:

  • Zero interrupt latency
  • Interrupts are never disabled
  • Memory allocation and de-allocation are deterministic

This approach delegates the timing of the application to Q▪Kernel, so the developer can focus on the behavior of the system.

A Hard Real Time RTOS like Q▪Kernel helps the developer create a dependable and stable system. Once the application has been tested, it keeps working because the timing will always be the same.

Memory Management

Most Real Time Operating Systems provide the developer with a simple fixed-size-blocks memory allocation algorithm. This works well but is inflexible, because the developer needs to know the memory requirements, including block sizes, in advance. Q▪Kernel supports three memory allocation mechanisms so the developer can always use the optimal mechanism. The memory allocation mechanisms are not only versatile but also much faster compared to any competitor.

  • “Allocate only” Heap: In embedded systems, many blocks are permanently allocated at startup. The heap works well because each block can be exactly the right size. Fragmentation is not a problem because these blocks are never released. This algorithm is fast and deterministic.
  • Fixed Memory Blocks: Q▪Kernel also implements the traditional fixed-size-blocks memory allocation algorithm, with an added facility to allocate and de-allocate from interrupt handlers.
  • Dynamic Memory Blocks: The Dynamic Memory Block algorithm manages blocks of every size and functions like malloc() and free(). The algorithm is fast and reasonably deterministic. There is no external fragmentation.
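The first two mechanisms in the list above are classic techniques and can be sketched generically. The code below is not Q▪Kernel source: heap_alloc_only, pool_init, pool_alloc and pool_free, as well as the sizes, are hypothetical names and values. The pool shown is not interrupt-safe; making it usable from interrupt handlers would need atomic updates of the free list, which this sketch omits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names and sizes; generic techniques, not Q-Kernel source. */

/* --- "Allocate only" heap: a bump pointer that is never rewound --- */
#define HEAP_SIZE 256
static _Alignas(max_align_t) unsigned char heap[HEAP_SIZE];
static size_t heap_used;

void *heap_alloc_only(size_t size)
{
    const size_t align = _Alignof(max_align_t);
    if (size == 0)
        return NULL;
    size_t rounded = (size + align - 1) / align * align;
    if (rounded < size || rounded > HEAP_SIZE - heap_used)
        return NULL;                     /* overflow or out of memory */
    void *p = &heap[heap_used];
    heap_used += rounded;                /* constant time, no free list to search */
    return p;
}

/* --- Fixed-size blocks: a singly linked free list threaded through the blocks --- */
#define BLOCK_SIZE  32
#define BLOCK_COUNT 4

typedef union block {
    union block  *next;                  /* valid while the block is free */
    unsigned char data[BLOCK_SIZE];      /* valid while the block is allocated */
} block_t;

static block_t pool[BLOCK_COUNT];
static block_t *free_list;

void pool_init(void)
{
    free_list = NULL;
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* Pop the first free block; constant time. Returns NULL when the pool is empty. */
void *pool_alloc(void)
{
    block_t *b = free_list;
    if (b != NULL)
        free_list = b->next;
    return b;
}

/* Push a block back onto the free list; constant time. */
void pool_free(void *p)
{
    assert(p != NULL);
    block_t *b = p;
    b->next = free_list;
    free_list = b;
}
```

Both allocators run in constant time, which is what makes them deterministic. The bump allocator wastes nothing on bookkeeping but can never release memory; the pool can release memory but only in one block size.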

Statistics and Thread & Fiber Tracking

Q▪Kernel provides detailed statistics per thread and system overhead. It provides a very accurate picture of how much CPU time is used per thread by Q▪Kernel, and how much free CPU time there is.

While statistics are very useful, they change the timing of the application. Fiber and thread tracking is non-intrusive and does not change the timing. Tracking makes it possible to follow the application timing in detail with a logic analyzer, including individual threads, priority and queued fibers and the Q▪Kernel scheduler.

Another form of tracking is the switch notification. A user function will be called when the system switches threads. The developer can write code to find problems or can code extensive tracking mechanisms for debugging purposes.
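A switch notification of this kind is essentially a user callback invoked by the scheduler. The sketch below assumes a hypothetical registration function set_switch_hook and a stand-in scheduler_switch_to; neither name comes from Q▪Kernel. The example hook counts how often each thread is activated, but on a target it could just as well toggle a pin for a logic analyzer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; not the Q-Kernel API. */
#define NUM_THREADS 4

typedef void (*switch_hook_fn)(unsigned from_thread, unsigned to_thread);

static switch_hook_fn switch_hook;
static unsigned current_thread;

void set_switch_hook(switch_hook_fn fn)
{
    switch_hook = fn;
}

/* Stand-in for the scheduler's context switch: notify the user, then switch. */
void scheduler_switch_to(unsigned next)
{
    assert(next < NUM_THREADS);
    if (next == current_thread)
        return;                          /* no switch, no notification */
    if (switch_hook != NULL)
        switch_hook(current_thread, next);
    current_thread = next;
}

/* Example user hook: count how often each thread is switched in. */
static unsigned activations[NUM_THREADS];

void count_activations(unsigned from_thread, unsigned to_thread)
{
    (void)from_thread;
    activations[to_thread]++;
}

unsigned activation_count(unsigned thread)
{
    assert(thread < NUM_THREADS);
    return activations[thread];
}
```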

Real-time Clock, Alarm and Timer functions

Q▪Kernel provides a software real-time clock for human timing requirements and a timer for processor-scale timing. Because Q▪Kernel is tick-less, it provides timing functions ranging from a granularity of 1 µs up to 30 years. The RTCC or a 32 kHz crystal with TMR1 can be used to create the software real-time clock, or it can be emulated from the processor clock.

Both the RTCC and the processor clock are completely disconnected from threads, which makes them very versatile. Expired timers or RTCC alarms run as fibers and use the interrupt stack. Because timers are managed objects, they can be created, deleted and opened. Re-using timers is possible.

Some competing products set event flags. This almost always results in a context switch, because a thread is waiting for that event. In a lot of cases this is not necessary, because all the work can be done without a context switch.

Q▪Kernel also integrates with the timing functions of the TCP/IP stack.

Centralized Error Handling

Centralized error handling minimizes code complexity. When a Q▪Kernel function returns, the developer knows that the function ended successfully so it is not necessary to test the result. When an error occurs, control resumes at the central error handler instead of being passed back to the caller.
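The control flow described above, where a failing call never returns to its caller, can be imitated on a host with setjmp/longjmp. This is only an analogy under stated assumptions: raise_error, run_guarded, queue_put and the error code are hypothetical, and an embedded kernel would more likely log the error and reset than unwind with longjmp.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical names and codes; a host-side analogy, not Q-Kernel code. */
enum { ERR_NONE = 0, ERR_QUEUE_FULL = 17 };

static jmp_buf error_entry;
static volatile int last_error;

/* Central entry point for all errors. It never returns to the failing call. */
void raise_error(int code)
{
    assert(code != ERR_NONE);
    last_error = code;
    longjmp(error_entry, 1);
}

#define QUEUE_CAPACITY 2
static int queue_len;

/* A service call with no status to check: if it returns, it succeeded. */
void queue_put(int value)
{
    (void)value;
    if (queue_len == QUEUE_CAPACITY)
        raise_error(ERR_QUEUE_FULL);
    queue_len++;
}

/* Run application code under the central handler.
 * Returns ERR_NONE, or the unique code of the error that stopped it. */
int run_guarded(void (*app)(void))
{
    queue_len = 0;
    if (setjmp(error_entry) != 0)
        return last_error;               /* the one place where errors end up */
    app();
    return ERR_NONE;
}

/* Application code contains no error checks at all. */
void app_ok(void)
{
    queue_put(1);
    queue_put(2);
}

void app_overflow(void)
{
    queue_put(1);
    queue_put(2);
    queue_put(3);                        /* third put exceeds the capacity */
}
```

A single breakpoint on raise_error catches every failure, which mirrors the debugging benefit of a central handler.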

Central error handling is very efficient during debugging because the developer only has to set one breakpoint for all fatal errors. All errors have a unique code so the developer knows immediately what went wrong. The system can also be used for application errors.

Errors can occur in interrupts and in other places where it is impossible to log them directly, and the system might be unstable due to the error. Q▪Kernel has a special feature that makes it possible to log the errors after a reset.

Q▪Kernel is one of the few products that uses the stack overflow detection of the processor when it is available. This means that undetected stack overflow is a thing of the past and problems can be detected early on. Stack overflow and other exceptions are integrated in the centralized error handling.

Best Performance

Q▪Kernel delivers the best overall performance of all competitors according to the Thread-Metric benchmark test suite produced by Express Logic, Inc.

Performance of an RTOS is very important because all timing and context switches are done by the RTOS. The reaction time of an application is not determined by the specific application code but by the RTOS.

The performance results of Q▪Kernel show that it is the most functional and complete RTOS available. Please see the performance results here.


Publish/Subscribe

Publish/subscribe (or pub/sub) is a messaging pattern where senders (publishers) are unaware of specific receivers (subscribers). Subscribers express interest in one or more messages, and only receive messages that are of interest, without knowledge of what, if any, publishers there are.

This decoupling of publishers and subscribers is called loose coupling and has a number of advantages.

  • A change in one module does not force a ripple-effect of changes in other modules. Developing and maintaining software requires less effort and time due to the decreased inter-module dependency.
  • It will be easier to reuse software because there are fewer dependencies. A software module will be easier to test because dependent modules do not need to be included.
  • No changes are required if the number of subscribers or publishers changes.
  • This pattern promotes agility because a change in the application does not require that all software modules have to be changed and re-tested.

The Q▪Kernel implementation of pub/sub is very simple to use and provides three delivery mechanisms: functions, pipes and queues.
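Delivery by function, the simplest of the three mechanisms, can be sketched as a table of callbacks per topic. The names subscribe and publish, the integer message type and the table sizes are assumptions chosen for illustration and are not the Q▪Kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; not the Q-Kernel API. */
#define MAX_TOPICS      4
#define MAX_SUBSCRIBERS 4

typedef void (*subscriber_fn)(int message);

static subscriber_fn table[MAX_TOPICS][MAX_SUBSCRIBERS];

/* Register interest in a topic. Returns 0 on success, -1 if the topic is full. */
int subscribe(unsigned topic, subscriber_fn fn)
{
    assert(topic < MAX_TOPICS && fn != NULL);
    for (size_t i = 0; i < MAX_SUBSCRIBERS; i++) {
        if (table[topic][i] == NULL) {
            table[topic][i] = fn;
            return 0;
        }
    }
    return -1;
}

/* Deliver a message to every subscriber of the topic.
 * The publisher does not know who, if anyone, receives it.
 * Returns the number of subscribers that were called. */
int publish(unsigned topic, int message)
{
    assert(topic < MAX_TOPICS);
    int delivered = 0;
    for (size_t i = 0; i < MAX_SUBSCRIBERS; i++) {
        if (table[topic][i] != NULL) {
            table[topic][i](message);
            delivered++;
        }
    }
    return delivered;
}

/* Two independent subscribers of a temperature topic. */
static int display_value;
static int alarm_count;

void on_temp_display(int celsius) { display_value = celsius; }
void on_temp_alarm(int celsius)   { if (celsius > 80) alarm_count++; }

int shown_value(void)   { return display_value; }
int alarms_raised(void) { return alarm_count; }
```

Adding or removing a subscriber changes nothing in the publishing code, which is the loose coupling described above.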
