AVR Optimization: Introduction to Register-Level (Mid-Level) Coding
Updated for 2021/2022. Your comprehensive CHUMP journey has prepared you for the second half of your ICS4U experience: the deepest possible drilling into the hardware and software of modern microcontroller technology. The carefully designed sequence of register-level coding exercises below, exploiting familiar ACES' optical devices, is the final preparation for the final push in the Spring: Pure Assembly Language Programming of the AVR ATtiny84 on the Dolgin Development Platform (DDP).
Although these C coding exercises can be considered high-level, they call for a deeper understanding of the underlying architecture of the AVR family of MCUs. For now, this is the familiar ATmega328P, the MCU of choice for the UNO, Nano and other Arduino development boards. I often refer to this coding as mid-level, as it lies in between high-level C (which hides much of the functionality and potential optimization) and low-level Assembly language (to which you've had a brief introduction through your 7 CHUMP instructions). The magic behind functions such as pinMode(), digitalWrite(), digitalRead(), and shiftOut(), for example, is exposed and rewritten, in some cases, more efficiently. Direct port bit manipulation of the MCU's General Purpose Input/Output (GPIO) Registers is called for to achieve improved performance. Conveniently, your CHUMP experience has given you the necessary insights that were lacking at the end of Grade 11 to guide these next steps.
ATmega328P GPIO Register Reference
Register-Level Exercises
You may wish to create a folder entitled RegisterLevelCoding or something along those lines, to house this unique set of sketches I am asking you to develop.
References
The Arduino Language Reference web page includes links to resources in support of bit manipulation.
Two of these categories include the Bitwise Binary Operators and convenient Bitwise Functions that employ these and other operators to facilitate binary manipulation (aka bit-banging). Coding the bodies of these bitwise functions ourselves, to produce the same results, serves our immediate goals of building knowledge and skill.
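To make the idea concrete, here is a minimal sketch of how the bodies of four such bitwise functions might be written with nothing more than shifts and masks. The lowercase names with underscores are hypothetical, chosen so they do not collide with the Arduino core's own bitRead(), bitSet(), bitClear() and bitWrite(). Note also that these versions return the new value rather than modifying the variable in place, which is a design choice made here for easy checking.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the Arduino bitwise helpers.
   Each returns a result instead of modifying its argument. */

/* Value (0 or 1) of the given bit */
static inline uint8_t bit_read(uint8_t value, uint8_t bit) {
    return (uint8_t)((value >> bit) & 1u);
}

/* Copy of value with the given bit forced to 1 */
static inline uint8_t bit_set(uint8_t value, uint8_t bit) {
    return (uint8_t)(value | (1u << bit));
}

/* Copy of value with the given bit forced to 0 */
static inline uint8_t bit_clear(uint8_t value, uint8_t bit) {
    return (uint8_t)(value & ~(1u << bit));
}

/* Copy of value with the given bit forced to state (0 or non-zero) */
static inline uint8_t bit_write(uint8_t value, uint8_t bit, uint8_t state) {
    return state ? bit_set(value, bit) : bit_clear(value, bit);
}
```

A typical use would be something like x = bit_set(x, 3); to raise bit 3 of a byte while leaving the other seven bits untouched.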
Task.
B. General Purpose Input/Output (GPIO) Tasks.
Note. It's not how many lines of code that matters, it's often how few.
With the awareness that the AVR's GPIO registers are accessible by name to your Arduino C sketches, develop the most efficient register-level statements you can to achieve each of the following tasks on your UNO or Nano. The employment of register-level coding strategies precludes the use of high-level statements such as pinMode(), digitalRead(), or digitalWrite(), etc.
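As a point of departure, the sketch below shows what register-level replacements for pinMode(), digitalWrite() and digitalRead() can look like on Port B. It assumes the usual UNO/Nano mapping in which digital pin 13 is PB5 and digital pin 8 is PB0. The function names are hypothetical, and the host stand-ins for DDRB, PORTB and PINB exist only so the logic can be desk-checked off the board.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>                          /* the real DDRB, PORTB, PINB */
#else
static volatile uint8_t DDRB, PORTB, PINB;   /* host stand-ins (assumption) */
#endif

/* Equivalent of pinMode(13, OUTPUT): set bit 5 of the data direction register */
static void led13_output(void) { DDRB |= (1 << 5); }

/* Equivalent of digitalWrite(13, HIGH) and digitalWrite(13, LOW) */
static void led13_high(void) { PORTB |= (1 << 5); }
static void led13_low(void)  { PORTB &= (uint8_t)~(1 << 5); }

/* Equivalent of pinMode(8, INPUT_PULLUP): direction bit 0, PORT bit 1 */
static void pin8_input_pullup(void) {
    DDRB  &= (uint8_t)~(1 << 0);
    PORTB |= (1 << 0);
}

/* Equivalent of digitalRead(8): isolate bit 0 of the input register */
static uint8_t pin8_read(void) { return (uint8_t)(PINB & 1u); }
```

Each statement touches only the bit of interest, so the other pins on the port keep their configuration.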
Tasks
Task.
To be developed...
AVR Optimization: Create a Custom Library (Mega328P.h)
You've employed a number of libraries over the past year or so, from the Arduino Core libraries (transparently included in every project) to third party libraries (through the explicit use of the #include directive) that hide the functionality of complex devices and tasks through convenient object instantiation and subsequent member function calls. Well, it's time to establish and exploit your own custom library. For the first attempt, you'll pack useful ATmega328P resources (constants, data structures, functions, etc.) of your own design and selection. You'll gain some familiarity with the process from completing the directed exercises below. In doing so, future projects can reuse and enhance this code to yield the desired outcomes in less time, and with improved confidence.
Task 1. Separation of Code. First, let's consider some code that is generic enough to be reusable in future projects and confirm that it can exist in a separate file.
Task 2. Centralized Access to Common Code. Now, we should place your library (the Mega328P.h file) in a folder that is accessible to all future projects.
Task 3. Syntax Highlighting: keywords.txt. Syntax highlighting is a useful feature that enhances code development. The key to ensuring the personal assets of our custom libraries are supported by the Arduino IDE's conventional syntax highlighting scheme is a separate text file called keywords.txt.
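For Task 1, a first draft of the header might look something like the sketch below. The contents are purely illustrative: the macro and function names are hypothetical and you should substitute resources of your own design. The include guard is the one piece worth copying, since it prevents duplicate definitions if the file is included more than once.

```c
/* Mega328P.h : a hypothetical starting point for a personal library */
#ifndef MEGA328P_H
#define MEGA328P_H

#include <stdint.h>

/* On the UNO/Nano, digital pin 13 (onboard LED) is bit 5 of Port B */
#define LED13_BIT 5

/* Mask with only bit n set, e.g. BIT_MASK(5) is 0x20 */
#define BIT_MASK(n) ((uint8_t)(1u << (n)))

/* Lower and upper four bits of a byte */
static inline uint8_t low_nibble(uint8_t b)  { return (uint8_t)(b & 0x0F); }
static inline uint8_t high_nibble(uint8_t b) { return (uint8_t)(b >> 4); }

#endif /* MEGA328P_H */
```

A sketch in the same folder would then pull these in with #include "Mega328P.h" before using them.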
With bit-level coding of the ATmega328P's Digital IO registers (0x0020-0x005F) under your belt, it's time to excavate the next level of SRAM Addresses.
Review the ATmega328P's SRAM Register File one more time.
Here's a question for you. Based solely on the undefined (grayed out) addresses of the 328P's digital IO ports, what ports might you expect to find if you examined the ATmega2560's Register File?
Interrupts: The Key to Software Optimization
Our ACES program places a high premium on productivity. Since organization and efficiency drive productivity, it is necessary to dig deeper into the architecture of the AVR microcontroller to familiarize ourselves with its capabilities in support of our code strategies and structures.
It's pretty obvious that watching your phone constantly for incoming text messages is not a productive use of your time. Likewise, standing at your front door waiting for that Amazon delivery is also a colossal waste of time. Finally, you could probably bring in the green bins from the curb rather than watching and waiting for the kettle to boil up your next cup of tea. The good news is that most of these 'Events' come with associated 'Alerts'. Today's microcontrollers are no different and the AVR family is no exception. For example, the analogRead() function initiates an analog-to-digital conversion process that takes time. To the human this takes virtually no time at all but, to the MCU, these are precious clock cycles that could be used to perform a parallel task while the conversion is waiting to complete. So, wouldn't it be terrific to simply launch an analog-to-digital conversion, then do something else and wait to be alerted (i.e. interrupted) that the conversion is complete and ready for reading? This sounds highly productive. The AVR line of MCUs has a built-in architecture for alerting or interrupting your code when it finishes certain tasks. We need to exploit this.
The most efficient software for the embedded developer is typically none at all. This may sound flippant but if the hardware platform you're developing on has native circuitry for a given task, there is (likely) no software routine that will perform more efficiently. Review the ATmega328P's Interrupt Vector Table (IVT) to the right.
This table suggests there are as many as 26 different sources of events on the ATmega328P that have the ability to complete the following sequence,
The order in which these Interrupts are listed carries with it additional meaning. Implied in this order is a sense of priority. By being #1, a hard (or soft) Reset has the highest priority and takes precedence over any other task your MCU is performing, resets the Program Counter to 0, and code starts from the beginning again. We'll start our formal implementation of Interrupt coding with the Reset Interrupt.
You are familiar with the effect of pressing the Reset button on the UNO or Nano; however, further optimization is accessible through a deeper understanding of the subtleties of the various Reset sources of AVR MCUs. These sources can be detected and subsequently impact how your code chooses to restart.
Chapter 11 of the ATmega328P datasheet introduces the concept of System Control and, in particular, the Reset Function. The MCU offers four hardware Reset triggers that include: Power On, External, BrownOut, and Watchdog. At the end of this chapter, there is a description of the Registers that can be inspected and manipulated. For our purposes, we'll limit ourselves to a simple distinction between the Power On and External Reset triggers. Page 54 of the ATmega328P datasheet presents a detailed discussion of the MCUSR Register, (see: ATmega328P Register Summary) in which the source of a Reset can be identified through the use of flags (bits).
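The flag test itself can be sketched as a small decoding function. The bit positions below are those of MCUSR on the ATmega328P (PORF in bit 0, EXTRF in bit 1, BORF in bit 2, WDRF in bit 3). The enum and function names are hypothetical, as is the decision to report Power On when both flags happen to be set. On the board you would typically copy MCUSR into a variable early in setup() and then clear the register, keeping in mind that a bootloader may already have read and cleared it before your sketch runs.

```c
#include <stdint.h>

/* MCUSR flag bit positions on the ATmega328P
   (suffixed _BIT to avoid clashing with the avr/io.h names) */
#define PORF_BIT  0   /* Power-on Reset flag  */
#define EXTRF_BIT 1   /* External Reset flag  */
#define BORF_BIT  2   /* Brown-out Reset flag */
#define WDRF_BIT  3   /* Watchdog Reset flag  */

/* Hypothetical result type: only the two sources we care about here */
typedef enum { RESET_OTHER, RESET_POWER_ON, RESET_EXTERNAL } reset_source_t;

/* Decode a saved copy of MCUSR. Assumes the register is cleared after
   each read, so normally at most one of the two flags is set. */
static reset_source_t reset_source(uint8_t mcusr) {
    if (mcusr & (1u << PORF_BIT))  return RESET_POWER_ON;
    if (mcusr & (1u << EXTRF_BIT)) return RESET_EXTERNAL;
    return RESET_OTHER;   /* brown-out, watchdog, or nothing recorded */
}

/* On the target, something along these lines:
     uint8_t flags = MCUSR;   // take a snapshot
     MCUSR = 0;               // clear for next time
     if (reset_source(flags) == RESET_EXTERNAL) { ... }
*/
```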
Task.
Interrupt #2 and #3. External Interrupt Requests 0 and 1 (INT0 and INT1)
From your previous year's experience you are aware that MCUs have digital pins that can be configured to respond to changing voltage levels presented by external behaviour. By way of example, the ATtiny84 has ONE such pin, the ATmega328P has TWO, and the ATmega2560 has EIGHT (six of which are accessible on the Arduino Mega 2560 board)!
The two on the ATmega328P are referred to as INT0 and INT1 (digital pins 2 and 3 respectively). As can be seen from the IVT (below, left), each has a separate vector address and immediately follows the System Reset Interrupt in order of priority. An excerpt from Chapter 13 of the ATmega328P datasheet appears below right. Configured correctly, the ATmega328P can sense and respond to four voltage level behaviours presented on the INT0 and INT1 pins.
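The four behaviours correspond to the four combinations of the two Interrupt Sense Control bits each pin is given in the EICRA register (ISC01:ISC00 in bits 1:0 for INT0, ISC11:ISC10 in bits 3:2 for INT1). The sketch below packs a choice for each pin into a value suitable for EICRA. The enum and function names are hypothetical.

```c
#include <stdint.h>

/* The four sense options, numbered as the two ISC bits encode them */
typedef enum {
    SENSE_LOW_LEVEL  = 0,   /* ISCn1:ISCn0 = 00 */
    SENSE_ANY_CHANGE = 1,   /* ISCn1:ISCn0 = 01 */
    SENSE_FALLING    = 2,   /* ISCn1:ISCn0 = 10 */
    SENSE_RISING     = 3    /* ISCn1:ISCn0 = 11 */
} sense_t;

/* Build an EICRA value: INT0 choice in bits 1:0, INT1 choice in bits 3:2 */
static uint8_t eicra_value(sense_t int0, sense_t int1) {
    return (uint8_t)(((uint8_t)int1 << 2) | (uint8_t)int0);
}

/* On the target, something along these lines:
     EICRA = eicra_value(SENSE_FALLING, SENSE_RISING);
     EIMSK |= (1 << INT0) | (1 << INT1);   // enable both requests
     sei();                                // global interrupt enable
   with the handlers written as ISR(INT0_vect) and ISR(INT1_vect).
*/
```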
ATmega328P Interrupt Vector Table | ATmega328P External Interrupt Registers |
---|---|
Task.
Interrupt #4, #5, and #6. Pin Change Interrupt Requests 0, 1, and 2
for 2022. 20/21 ICS4U ACES were the first class to be introduced to the ACES Rotary Encoder. A few short months later A. Goldman and S. Atkinson used their familiarity with the device to support their Clue Capturer prototype for their ECE190 course at Waterloo. Their achievement is the inspiration behind this project segment of your journey.
Your Bourns PEC11L-4215F-S0015 Rotary Encoder offers three sources for useful interrupts. These are A, B, and SW. Given that the ATmega328P only provides two external interrupts, we look beyond INT0 and INT1.
In reading Chapter 13 of the ATmega328P datasheet you discover that interrupts can be triggered on ANY digital IO pin in the event that a logical level change has been detected in hardware. A little bit of software detective work is required to determine which pin was triggered as the 24 pins (PCINT0..PCINT23) are organized into three separate banks defined by the respective Ports. Pins on PortB are mapped to PCINT0..7, pins on PortC are mapped to PCINT8..15, and pins on PortD are mapped to PCINT16..23.
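The detective work usually comes down to two small calculations: locating a PCINT number within its bank, and comparing the port's current reading with the previous one to see which bit moved. A minimal sketch follows, with hypothetical function names. Inside a real handler such as ISR(PCINT2_vect), the current value would come from PIND and the previous value from a variable you maintain yourself.

```c
#include <stdint.h>

/* Bank 0 is PortB (PCINT0..7), bank 1 is PortC (PCINT8..15),
   bank 2 is PortD (PCINT16..23) */
static uint8_t pcint_bank(uint8_t pcint) { return (uint8_t)(pcint / 8); }

/* Bit position of the pin within its port */
static uint8_t pcint_bit(uint8_t pcint) { return (uint8_t)(pcint % 8); }

/* Pins that differ between two snapshots of a PINx register,
   restricted to the pins enabled in the matching PCMSKn mask */
static uint8_t changed_pins(uint8_t previous, uint8_t current,
                            uint8_t enabled_mask) {
    return (uint8_t)((previous ^ current) & enabled_mask);
}
```

For example, PCINT18 works out to bank 2, bit 2, which is PD2 (digital pin 2 on the UNO).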
A sample of the applicable Pin Change Interrupt registers of the ATmega328P appears below.
Task.
Interrupt #7. Watchdog Time-Out Interrupt (with and without a System Reset)
The watchdog timer (WDT) is an MCU subsystem that can be configured to perform scheduled functionality. From the ATmega328P datasheet, 'The watchdog timer is clocked from an on-chip oscillator which runs at 128kHz. By controlling the watchdog timer prescaler, the watchdog reset interval can be adjusted as shown in Table 11-2 on page 55.'
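The arithmetic behind that table is worth a quick look. The prescaler offers ten settings (0 through 9), each doubling the number of oscillator cycles counted, starting from 2048. Dividing by the nominal 128 kHz clock gives the interval. The sketch below is a worked calculation only, with hypothetical function names, and the results are nominal since the on-chip oscillator varies with voltage and temperature.

```c
#include <stdint.h>

#define WDT_OSC_HZ 128000UL   /* nominal watchdog oscillator frequency */

/* Oscillator cycles counted for prescaler setting wdp (0..9):
   2K, 4K, 8K, ... up to 1024K cycles */
static uint32_t wdt_cycles(uint8_t wdp) {
    return (uint32_t)(2048UL << wdp);
}

/* Nominal time-out in milliseconds for prescaler setting wdp (0..9) */
static uint32_t wdt_timeout_ms(uint8_t wdp) {
    return (uint32_t)(wdt_cycles(wdp) * 1000UL / WDT_OSC_HZ);
}
```

Setting 0 works out to 16 ms and setting 9 to 8192 ms, or roughly 8 seconds.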
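The WDTON fuse determines whether the WDT is forced permanently on in System Reset mode (WDTON programmed, 0) or left under software control (WDTON unprogrammed, 1) (see AVR FUSE Calculator). The factory default leaves it unprogrammed (WDTON=1), so the watchdog remains off until your code enables it. As with any MCU subsystem, should it not be required, advanced applications may wish to disable the feature in order to conserve power.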
WDT Tasks. (Three Variations)
From Table 11-1 to the right it can be seen that there are FIVE courses of action that can be taken following the end of an elapsed WatchDog Timer interval. We'll start with a simple Interrupt.
As an alternative to the runaway code context above, it is not difficult to imagine a remotely-situated, solar-powered, MCU-based sensor device that collects (and possibly transmits) data to a centralized location. Understandably, optimum use of power would contribute to its viability and sustainability. This segment provides some insights into how some aspects of such a prototype could be implemented to conserve power between readings. One video I encourage you to watch is Kevin Darrah's Low Power Arduino! Deep Sleep Tutorial. This presentation contains applicable material for this investigation.
ATtiny84 GPIO Register Reference
Register-Level Coding of the ATtiny84
Task.
3. Mid-Level ATtiny84 Makeover of the Classic I/O Functions
This is a large pool we're wading into, so let's start in the shallow end. Some of the first Arduino functions you were introduced to were pinMode(), digitalWrite(), digitalRead(), and shiftOut(). Those simple blinking LEDs provided early, but important, confirmation that your software and hardware efforts were meshing, and on the right track. The only problem was that there was too much 'behind the curtain' magic that, over a year later, we need to explore in order to optimize our future engineering objectives.
Task.
These are some of the familiar ACES optical devices we'll apply our nascent mid-level coding practices to ...
Your Session 4 loot bag includes two 3mm bicolor LEDs (red/green). Place the two leads of one of them in PA6-7, with the longer lead in PA6.
Create the project BicolorAlternation that employs direct, register-level GPIO Port manipulation resulting in red/green alternate flashing every half second.
6. PlayByte. 5mm Red/Blue Bicolor LED.
Your Session 4 loot bag also includes a 5mm bicolor LED (red/blue). Place its three leads in PA0-2, with the second longest lead in PA0.
Create the project PlayByte that employs direct, register-level GPIO Port manipulation resulting in a continuous, 8-frame animated sequence that 'plays' a byte in an LSBFIRST order. The code is to be Red for 0 and Blue for 1. Each colour is to be held for 1 second, with no delay or gap in between. At the end of the 8-bit colour sequence, turn the LEDs off for 3 seconds before repeating, indefinitely.
Example. The image (above right) reflects the playing of B10011101 (or 0x9D or 0235 or 157).
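The frame order for this example can be checked with a few lines of C. The sketch below is only the bit-ordering logic, not the port manipulation you are asked to develop, and its enum and function names are hypothetical. Frame 0 is the least significant bit, so B10011101 plays as Blue, Red, Blue, Blue, Blue, Red, Red, Blue.

```c
#include <stdint.h>

/* Colour code from the task: Red for 0, Blue for 1 */
typedef enum { RED = 0, BLUE = 1 } colour_t;

/* Colour of the given frame (0..7) when a byte is played LSBFIRST:
   frame 0 shows bit 0, frame 7 shows bit 7 */
static colour_t frame_colour(uint8_t value, uint8_t frame) {
    return ((value >> frame) & 1u) ? BLUE : RED;
}
```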
7. ASCII Alphabet on a 7-Segment Display.
Your ICS3U experience introduced you to the concept of a segment map (array) for an ASCII letter lookup table (LuT). The code can be found at AVRFoundations: Write7SegUpperCaseCharacters.ino.
Obtain the array code for the segment map and drop it into a new project entitled ASCIIAlphabet. Develop register-level code that makes optimum use and efficiency of the ATtiny84's ports to continuously cycle through the LuT, displaying the letter representation on your 7-segment display device.
8. 10-LED Bargraph.
(The ATtiny84 is just that, tiny, so this exercise may push the limits of the MCU's GPIOs). Create the project BargraphAnimation that uses a 1:1 GPIO port pin to LED mapping to duplicate the animation depicted in the image to the right. Strive to make it as efficient as possible.
9. Register-Level 74HC595 Shiftout (Morland Bargraph).
The previous exercise left little GPIO room for anything else. We need to drive a 10-LED bargraph with fewer port pins, so we return to the 74HC595 shift register. However, in this iteration, we forego the use of the core shiftOut() function in favour of our own register-level bit manipulation strategy.
First, review the primer below on the 74HC595.
Create the project ShiftOutAnimation that duplicates the result of the previous exercise but only requires three ATtiny84 GPIO pins, as opposed to ten.
10. 64-LED 2D Matrix.
11. ACES' CharlieStick. TBC.
12. ACES CharlieMatrix. TBC.
A shift register is a device typically used to expand the number of pins of a microcontroller. The most common design by far is based on the 74HC595 architecture. Using (as few as) three pins of the MCU (coloured) you are able to control 8 (or more) data lines (QA..QH). In addition to the three pins that control shifting, two other pins (Output Enable and Master Reset) provide additional (Active Low) control over the state of the 8-stage internal register set and output latches.
The IC accepts bits serially and presents them on output pins in parallel.
Two sets of naming conventions for pins tend to confuse those new to the IC. The image below tries to address this issue.
2. Waveform Timing Details. The digital orchestration of the serial input bit stream is summarized in a waveform or timing diagram. A more detailed explanation of the mechanics of Serial to Parallel Shifting-Out with the 74HC595 can be found by following the link to the Arduino tutorial. Let's attempt to decipher the timing diagram, extracted from the 74HC595 datasheet.
As you are well aware, the immediate benefit of the Arduino shiftOut function is to hide the details of the digital dance allowing the higher-level programmer to concentrate on more macro concepts. However, hiding details always comes at a cost; if not performance then, at the least, in understanding. Our goal is to solidify our lower-level coding skills through direct handling of the signals on the three control pins, thereby bringing you closer to the AVR and IC hardware. This brings all sorts of future dividends.
3. Prototyping Platform. Insert your Morland Bargraph into the DDP in the most feature-suitable position (supply and PWM) as shown to the right.
4. Register-Level shiftout. With the signals and waveforms of the 74HC595's timing diagram understood, we can now tackle the low-level responsibilities of the waveform ourselves through direct register manipulation. Remember, our goal is to enlighten, not suggest this is the preferred alternative in all cases.
Start with the shiftOut function header. We'll use a purely lowercase shiftout to avoid compiler confusion:
// LSBFIRST:0 MSBFIRST:1
void shiftout(uint8_t order, uint8_t value)
Complete the body of the function, hard-coding the port manipulation for this platform.
Write a loop() function that exercises your shiftout() function to display an interesting pattern on the bargraph.
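One possible shape for the body is sketched below, for comparison with your own attempt. The wiring is an assumption (data on PA0, latch on PA1, clock on PA2) and must be matched to your actual DDP layout. The bit_at() helper is hypothetical, and the host stand-ins for DDRA and PORTA exist only so the logic can be desk-checked off the board. Unlike the core shiftOut(), this version also handles the latch pin, which is a design choice rather than a requirement.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>                    /* the real DDRA and PORTA (ATtiny84) */
#else
static volatile uint8_t DDRA, PORTA;   /* host stand-ins (assumption) */
#endif

/* Assumed wiring: adjust to your own platform */
#define DATA_BIT  0   /* PA0 to the 74HC595 serial data input */
#define LATCH_BIT 1   /* PA1 to the 74HC595 latch (storage) clock */
#define CLOCK_BIT 2   /* PA2 to the 74HC595 shift clock */

/* Bit presented on step i (0..7). order is LSBFIRST:0 MSBFIRST:1 */
static uint8_t bit_at(uint8_t order, uint8_t value, uint8_t i) {
    return (uint8_t)(order ? (value >> (7 - i)) & 1u : (value >> i) & 1u);
}

void shiftout(uint8_t order, uint8_t value) {
    /* All three control pins are outputs */
    DDRA |= (1 << DATA_BIT) | (1 << LATCH_BIT) | (1 << CLOCK_BIT);
    /* Hold latch and clock low while the byte is shifted in */
    PORTA &= (uint8_t)~((1 << LATCH_BIT) | (1 << CLOCK_BIT));
    for (uint8_t i = 0; i < 8; i++) {
        if (bit_at(order, value, i)) {
            PORTA |= (1 << DATA_BIT);             /* present a 1 */
        } else {
            PORTA &= (uint8_t)~(1 << DATA_BIT);   /* present a 0 */
        }
        PORTA |= (1 << CLOCK_BIT);                /* rising edge shifts it in */
        PORTA &= (uint8_t)~(1 << CLOCK_BIT);
    }
    PORTA |= (1 << LATCH_BIT);   /* rising edge copies the byte to the outputs */
}
```

A call such as shiftout(1, 0x9D); would send the byte most significant bit first and then latch it.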