Bering-uClibc 5.x - Developer Guide - Adding a Hardware Architecture Variant

Revision as of 14:52, 31 March 2012 by Davidmbrooke (Talk | contribs) (uClibc Configuration File: Added note on (removed) multithreaded "make oldconfig")



Introduction

A major enhancement added in Bering-uClibc 5.x is the ability to target non-x86 runtime platforms. In principle it would now be possible to build Bering-uClibc 5.x for SPARC, MIPS or other CPU architectures. These notes provide guidance on what changes are required to add support for a brand new target architecture variant. The addition of support for the ARM11 processor on the Raspberry Pi single board computer is used as an example.

The first step is to understand exactly what hardware the target platform consists of. In particular:

  • What is the model number of the CPU?
  • What is the "architecture" of the CPU?
    • The ARM1176JZF-S CPU implements the ARMv6 architecture standard
    • This appears to fit within the Broadcom "BCMRING" "system on a chip" family.
      • Not completely sure about this yet, but it seems the most likely candidate Davidmbrooke 14:30, 24 March 2012 (UTC)


Note: These notes are intended to provide guidelines rather than a fully prescriptive recipe to follow.

Warning: The cross-compilation build system is under active development and these notes reflect the situation at the time of writing. If recent changes have been made they may be out of step with the Bering-uClibc 5.x code in Git.


Linux Kernel CPU Architecture Selection

The standard Linux kernel source tree includes architecture-specific code for a large number of CPU types. This code is in the "arch" directory within the kernel source tree and it is sensible to review the contents of this directory. If you have not already extracted the kernel source, run:

./buildtool.pl source kernel
cd source/linux/linux-3.2.*/arch

Each of the directory names under "arch" represents a fundamental "architecture" variant. The Bering-uClibc toolchain references this via the ARCH variable.

Note: There are a few "special cases", which include i386 and x86_64! Refer to the comments and code in source/linux/linux-3.2.*/Makefile (starting around line 174) for further details.

Since there is a sub-directory of "arch" called "arm" that is what we need to set the "ARCH" variable to when building a toolchain to target the Raspberry Pi. Details of how and where to do that are provided below.
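
As a quick check, the candidate values for ARCH can be listed directly from the extracted kernel tree. The following is a minimal sketch; list_kernel_archs and is_kernel_arch are hypothetical helper names, not part of the Bering-uClibc scripts.

```shell
# Hypothetical helpers (not part of the Bering-uClibc scripts).

# Print the name of each sub-directory of the kernel "arch" directory $1,
# one per line. Plain files in that directory are skipped.
list_kernel_archs() (
    for d in "$1"/*/; do
        [ -d "$d" ] && basename "$d"
    done
    exit 0
)

# Succeed if $2 is the name of a sub-directory of the "arch" directory $1.
# The "special cases" such as i386 and x86_64 have no directory of their
# own, so this check fails for them even though they are valid ARCH values.
is_kernel_arch() {
    [ -n "$2" ] && [ -d "$1/$2" ]
}

# Example, run from the top of the build tree:
#   list_kernel_archs source/linux/linux-3.2.*/arch
#   is_kernel_arch source/linux/linux-3.2.*/arch arm && echo "ARCH=arm is valid"
```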

In addition to the fundamental CPU architecture setting the kernel recognizes a further level of "machine" specification. For example, under the umbrella architecture of i386 we have the "true" i386 and also i486, Pentium 4, Geode LX etc. and it is possible to select between those when compiling a kernel.

The exact details of what "machines" can be selected vary depending on the value of ARCH:

  • For i386 there are entries in the kernel .config file like the following:
    CONFIG_M686=y
  • For arm the permissible options are governed by the names of directories like arch/arm/mach-machinename (for example arch/arm/mach-bcmring) and then there are entries in the kernel .config file like the following:
    CONFIG_ARCH_BCMRING=y
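
The two checks described in the list above can be sketched in shell as follows. Both function names are hypothetical, and the directory argument is assumed to be the "arm" sub-directory of the extracted kernel "arch" directory.

```shell
# Hypothetical helper: print the machine names found under the ARM "arch"
# directory $1 (for example source/linux/linux-3.2.*/arch/arm), one per line,
# by stripping the "mach-" prefix from each mach-* sub-directory.
list_arm_machines() (
    for d in "$1"/mach-*/; do
        [ -d "$d" ] || continue
        name=$(basename "$d")
        echo "${name#mach-}"
    done
)

# Hypothetical helper: succeed if the kernel .config file $1 sets option $2
# to "y".
config_enabled() {
    grep -q "^$2=y\$" "$1"
}

# Example:
#   list_arm_machines source/linux/linux-3.2.*/arch/arm
#   config_enabled .config CONFIG_ARCH_BCMRING && echo "bcmring selected"
```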

Different users run different machines, which demand incompatible settings of the kernel .config variables. For that reason the option to build for multiple machine variants has been part of the Bering-uClibc toolchain since Bering-uClibc 4.x.

The Bering-uClibc 5.x toolchain uses the variable KARCHS to specify a space-separated list of "machines" to build for using a single toolchain.

For the Raspberry Pi the relevant setting is (probably) bcmring.

Note: As will be seen by the later description of how these settings are processed there is nothing "magic" about the values in KARCHS. They are just unique string labels used to identify patch files for the kernel .config and these patch files can contain system-specific settings in addition to more generic CPU architecture settings. It may be more appropriate to choose a "system" name like alix rather than a "CPU" name like geode.


GCC and Binutils CPU Architecture Selection

The toolchain is responsible for building code for the target environment and it relies on the GCC (cross-)compiler to do most of the work.

The GNU toolset (most notably "configure") has a well-established way of identifying different target platforms by a hyphen-separated list of the key characteristics known as the "configuration name". This was initially the triplet cpu-manufacturer-kernel but is now more commonly the quadruplet cpu-manufacturer-kernel-os (though this is still often referred to as a "triplet"). For example, i486-unknown-linux-uclibc refers to:

  • an i486 CPU, installed in
  • an unknown hardware platform ("unknown" as in "we don't care whether a PC is made by HP, IBM, Dell etc."), running
  • the linux kernel, and
  • a uclibc C library-based operating system

The first field ("cpu") is of particular interest here. Having identified the Kernel CPU Architecture (ARCH), refer to the appropriate sub-page of the GCC "Hardware Models and Configurations" page in order to understand what options are available.

For example, on the ARM Options sub-page there is a definition of the permissible values for the -march command-line option to GCC and related tools. One of the permissible values is "armv6" which is an obvious match for the ARMv6 architecture which we know the ARM11 CPU family uses.

This (the setting for -march) also forms the first entry in the hyphen-separated "configuration name" string. Since the other three entries in this string are always the same for Bering-uClibc 5.x we therefore know what this full string is. For the Raspberry Pi the full string is "armv6-unknown-linux-uclibc".
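
To make the structure of the "configuration name" concrete, here is a small sketch that builds the string from a -march value and splits it back into its four fields. The function names are hypothetical and exist only for this illustration.

```shell
# Hypothetical helper: build the configuration name from the -march value $1.
# The last three fields are fixed for Bering-uClibc 5.x, as described above.
target_name_for() {
    echo "$1-unknown-linux-uclibc"
}

# Hypothetical helper: print the four fields of configuration name $1,
# one "field=value" pair per line.
split_target_name() (
    IFS=-
    set -f          # no filename expansion while splitting
    set -- $1       # split the name on hyphens into $1..$4
    printf 'cpu=%s\nmanufacturer=%s\nkernel=%s\nos=%s\n' \
        "${1:-}" "${2:-}" "${3:-}" "${4:-}"
)

# Example:
#   split_target_name "$(target_name_for armv6)"
```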

The Bering-uClibc 5.x toolchain references this "configuration name" via the GNU_TARGET_NAME variable.

Since this "configuration name" captures all the characteristics of the target system which need to be hard-coded into the toolchain it is a good string to use to identify and differentiate multiple toolchains. The buildtool.pl, buildpacket.pl and buildimage.pl scripts therefore use this "configuration name" as their "toolchainname" and they set the environment variable $GNU_TARGET_NAME based on the specified (or default) toolchainname.

The setting for -march ensures that the generated code will run on all CPUs which are compatible with that CPU architecture. For example, code compiled for i486 will also run on all later x86-compatible processors. GCC and related tools make it possible to optimise code for a particular CPU while retaining compatibility with other CPUs by specifying the -mtune command-line option. The permissible values for this are specified on the same page as for -march above. For the Raspberry Pi there is an exact match for the actual CPU: arm1176jzf-s so this needs to be specified as the value for -mtune.
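
The relationship between the two options can be sketched as a tiny helper that assembles the compiler flags. The function name is hypothetical, and the cross-compiler name in the example assumes the usual GNU convention of prefixing tools with the configuration name.

```shell
# Hypothetical helper: print the GCC options that generate code for
# architecture $1 (-march) and, optionally, tune it for CPU $2 (-mtune).
arch_cflags() {
    if [ -n "${2:-}" ]; then
        echo "-march=$1 -mtune=$2"
    else
        echo "-march=$1"
    fi
}

# Example (assumes a cross-compiler named after the configuration name is
# on PATH; that name is an assumption, not taken from the build scripts):
#   armv6-unknown-linux-uclibc-gcc $(arch_cflags armv6 arm1176jzf-s) -c hello.c
```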


High-Level Toolchain Configuration

Once the required values for the ARCH, KARCHS and GNU_TARGET_NAME variables and the settings for the -march and -mtune command-line options have been identified it is time to start configuring a toolchain to target those settings. Most of the configuration is performed by editing file make/MasterInclude.mk.

The default toolchain target for Bering-uClibc 5.x is i486-unknown-linux-uclibc and this is specified as the default by the following lines in conf/buildtool.conf:

# default toolchain - override with "-t toolchain" argument to buildtool.pl
Toolchain=i486-unknown-linux-uclibc

As the comment says this can be overridden by specifying "-t toolchainname" to buildtool.pl (and "--toolchain ToolchainName" to buildpacket.pl). Alternatively the default value can be changed by editing conf/buildtool.conf. At the time of writing (2012-03-24) the tools/buildall.sh script only looks at the default setting in conf/buildtool.conf. Script buildimage.pl gets its setting of toolchainname from the relevant buildimage.cfg file since each image has to be generated with the corresponding toolchain and it doesn't make sense to override this with a command-line argument.
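
As an illustration, the default can be read back from the configuration file before deciding whether to override it. This is a sketch only; default_toolchain is a hypothetical helper and it assumes the "Toolchain=" line has exactly the form shown above.

```shell
# Hypothetical helper: print the default toolchain name from the
# buildtool.conf-style file $1, taken from the last "Toolchain=" line.
default_toolchain() {
    sed -n 's/^Toolchain=//p' "$1" | tail -n 1
}

# Example, run from the top of the build tree:
#   default_toolchain conf/buildtool.conf
#   ./buildtool.pl -t armv6-unknown-linux-uclibc build toolchain
```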

All of the build .pl scripts set environment variable $GNU_TARGET_NAME based on the specified (or default) toolchainname and $GNU_TARGET_NAME is used internally in other scripts and configuration files where toolchain-specific processing is required.

The main configuration file which reacts to the setting for $GNU_TARGET_NAME is make/MasterInclude.mk and this is where corresponding values for ARCH etc. must be specified. The standard make/MasterInclude.mk has a skeleton IF - THEN - ELSE structure which needs to be extended for each new toolchain target. The lines for the default toolchain look like this:

ifeq ($(GNU_TARGET_NAME),i486-unknown-linux-uclibc)
# Primary kernel architecture
export ARCH:=i386
# Space-separated list of kernel sub-archs to generate
export KARCHS:=i686 i486 geode
# Available kernel archs with pci-express support
export KARCHS_PCIE:=i686
# Arch-specific CFLAGS
export ARCH_CFLAGS=-march=i486 -mtune=pentiumpro

For the Raspberry Pi we need to add a new block of lines further down the file, just above the "else ifeq" line for the alternate architecture (not immediately below the lines shown above):

else ifeq ($(GNU_TARGET_NAME),armv6-unknown-linux-uclibc)
# Primary kernel architecture
export ARCH:=arm
# Space-separated list of kernel sub-archs to generate
export KARCHS:=bcmring
# Arch-specific CFLAGS
export ARCH_CFLAGS=-march=armv6 -mtune=arm1176jzf-s

If further options of the same kind as -march and -mtune are required they should be appended to ARCH_CFLAGS.
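
Before attempting a build it can be useful to confirm that the new block is really present. The following sketch uses a hypothetical helper that searches for the "ifeq" test as a fixed string.

```shell
# Hypothetical helper: succeed if makefile $1 contains an "ifeq" (or
# "else ifeq") test of GNU_TARGET_NAME against toolchain name $2.
has_toolchain_block() {
    grep -F -q "ifeq (\$(GNU_TARGET_NAME),$2)" "$1"
}

# Example, run from the top of the build tree:
#   has_toolchain_block make/MasterInclude.mk armv6-unknown-linux-uclibc \
#       || echo "no block for this toolchain yet"
```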

You will notice other lines within the IF - THEN - ELSE structure for the default toolchain - for example:

export ac_cv_sizeof_int=4

Don't worry about copying those for the new toolchain yet; they are covered later in this document.


Kernel Configuration File

If you were to try to build the new toolchain at this point it would fail with an error message because the build scripts will not be able to locate a kernel .config patch file with the right name. (The kernel source must be processed before building the toolchain executables in order to extract the header files.)

There needs to be a file called repo/linux/Bering-KVER.config-KARCH.patch for each KARCH in KARCHS, and this file must contain "diff" output which converts the base repo/linux/Bering-KVER.config into a specific kernel .config file suitable for KARCH.

For the Raspberry Pi KARCH = bcmring so the full file name is repo/linux/Bering-KVER.config-bcmring.patch. This name needs to be added to repo/linux/buildtool.cfg and the file must be created in the repo/linux/ directory.
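
The naming rule can be checked mechanically. The sketch below uses a hypothetical helper that reports which patch files are still missing for a given list of KARCHS; the kernel version is passed in by the caller rather than guessed.

```shell
# Hypothetical helper: for each KARCH in the space-separated list $3, print
# the expected patch file name under directory $1 for kernel version $2 if
# that file does not exist. Succeeds only if no patch file is missing.
missing_kernel_patches() (
    status=0
    for karch in $3; do
        f="$1/Bering-$2.config-$karch.patch"
        if [ ! -f "$f" ]; then
            echo "$f"
            status=1
        fi
    done
    exit "$status"
)

# Example (KVER and KARCHS must be set by the caller):
#   missing_kernel_patches repo/linux "$KVER" "$KARCHS"
```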

Constructing a suitable and fully correct patch file is non-trivial. One possible procedure is as follows:

  • Create (e.g. "touch") an empty patch file with the right name.
  • Run:
    buildtool.pl -t toolchainname source linux
  • This will recognize that the .config file is not compatible with the specified ARCH and prompt for new values for the kernel configuration variables which must be changed.
    • Run:
      tail -f log/buildtoollog
      to see the prompts from make oldconfig and answer the prompts in the shell where buildtool.pl is running.
  • Locate the generated .config file (should be source/toolchainname/linux/linux-KARCH/.config) and use that to generate the "real" patch file with a command like the following:
    diff -c ../Bering-KVER.config .config > ../Bering-KVER.config-KARCH.patch
    • This populates the actual differences into the patch file so that no prompts are displayed the next time the build is run.
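
One detail worth noting when scripting the last step is that diff exits with status 1 whenever the two files differ, which is the expected case here. The following sketch wraps the diff command from the procedure above in a hypothetical helper that treats only a status above 1 as an error.

```shell
# Hypothetical helper: write a context diff that turns base file $1 into
# file $2, storing it in $3. A diff exit status of 0 (identical) or
# 1 (different) is accepted; anything higher is reported as failure.
make_config_patch() {
    rc=0
    diff -c "$1" "$2" > "$3" || rc=$?
    [ "$rc" -le 1 ]
}

# Example, using the placeholder names from the procedure above:
#   cd source/toolchainname/linux/linux-KARCH
#   make_config_patch ../Bering-KVER.config .config ../Bering-KVER.config-KARCH.patch
```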


uClibc Configuration File

Just like the kernel, uClibc has a .config file which needs to be tailored for the new toolchain. For uClibc the file needs to be called repo/toolchain/config.$GNU_TARGET_NAME and this is a "full" file rather than a "patch".

For the Raspberry Pi the full file name is repo/toolchain/config.armv6-unknown-linux-uclibc. This name needs to be added to repo/toolchain/buildtool.cfg and the file must be created in the repo/toolchain/ directory.

As with the kernel .config it is non-trivial to create a file with the right contents. One possible procedure is as follows:

  • Copy the file for the default toolchain and edit it to reflect the correct ARCH and the correct value for CROSS_COMPILER_PREFIX.
  • Run:
    buildtool.pl -t toolchainname build toolchain
  • This will recognize that some different options need to be selected and prompt for new values for the uClibc configuration variables which must be changed.
    • Earlier versions of the build ran the uClibc "make oldconfig" with "$(MAKEOPTS)", which requests a multi-threaded build. As a result, unlike the kernel "make oldconfig", it refused to accept entries when the console input was redirected.
      • "$(MAKEOPTS)" has since been removed from this step (a multi-threaded build gives no performance benefit here).
    • Alternatively, go to the directory containing the "live" uClibc .config file and run:
      make menuconfig
      on the build host.
  • Locate the generated .config file (should be source/toolchainname/toolchain/uClibc-0.9.3*/.config) and use that as the "real" file with a command like the following:
    cp .config ../config.GNU_TARGET_NAME
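
The first step of this procedure (copying the default file and adjusting CROSS_COMPILER_PREFIX) might be sketched as below. The helper name is hypothetical, and the trailing hyphen in the prefix value is an assumption based on the usual GNU tool naming; the ARCH-related entries still need to be edited by hand or via "make menuconfig".

```shell
# Hypothetical helper: copy uClibc configuration file $1 to $2, replacing
# the CROSS_COMPILER_PREFIX line so that it names toolchain $3.
# Assumption: the prefix is the toolchain name followed by a hyphen.
retarget_uclibc_config() {
    sed "s|^CROSS_COMPILER_PREFIX=.*|CROSS_COMPILER_PREFIX=\"$3-\"|" "$1" > "$2"
}

# Example, run in repo/toolchain:
#   retarget_uclibc_config config.i486-unknown-linux-uclibc \
#       config.armv6-unknown-linux-uclibc armv6-unknown-linux-uclibc
```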

Like the kernel, uClibc has both "generic" (architecture) and "specific" (CPU) configuration entries. The "specific" entry is something like:

CONFIG_ARM1176JZF_S=y

There are references to this sort of configuration variable in source/toolchainname/toolchain/uClibc-0.9.3*/Rules.mak - for example:

CPU_CFLAGS-$(CONFIG_ARM1176JZF_S)+=-mtune=arm1176jzf-s -march=armv6

Recognize those? The implication is that the generated uClibc library will run on any armv6 processor but is optimized for the arm1176jzf-s, just like the kernel.
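
To see which flags a given "specific" entry selects, the relevant line can be pulled out of Rules.mak. This is a sketch with a hypothetical helper name; it assumes the line has exactly the form shown above.

```shell
# Hypothetical helper: print the flags that Rules.mak-style file $1 appends
# to CPU_CFLAGS when configuration variable $2 is enabled.
cpu_cflags_for() {
    grep -F "CPU_CFLAGS-\$($2)+=" "$1" | sed 's/^[^=]*=//'
}

# Example:
#   cpu_cflags_for source/toolchainname/toolchain/uClibc-0.9.3*/Rules.mak \
#       CONFIG_ARM1176JZF_S
```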

Note: The above is correct for uClibc 0.9.32 but the "specific" architecture configuration variables have been removed in uClibc 0.9.33.

Toolchain Build

That should be it. Running:

buildtool.pl -t toolchainname build toolchain

should create a toolchain based on the specified configuration settings.

In reality you will probably get build errors and will need to refine the contents of the kernel and uClibc .config files in order to get a successful toolchain build.

(No, it doesn't currently work for me either :-) Davidmbrooke 20:30, 24 March 2012 (UTC))

The steps performed as part of the toolchain build are described below.

Source Processing

Within conf/sources.cfg the "toolchain" source Package is declared to be dependent on the "linux" source Package so the kernel source gets processed first. In order to build the linux "source" target:

  • The kernel source .tar.bz2 file is unpacked
  • The kernel source patches are applied
  • For each entry in KARCHS
    • The generic kernel .config file is patched with the specific KARCH patch to create a specific .config file
    • The "make oldconfig" command is run (with appropriate command-line arguments)
    • The "make headers_install" command is run (with appropriate command-line arguments)
    • The generated header files are copied to the toolchain/$GNU_TARGET_NAME/usr/include/ directory
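
The per-KARCH loop above can be pictured with the following dry-run sketch, which only prints commands and runs nothing. The helper is hypothetical, KVER is left as a literal placeholder, and the exact arguments used by the real build scripts are not documented here, so the command lines are assumptions based on standard patch and kernel make usage.

```shell
# Hypothetical dry-run helper: print (but do not run) the commands for
# kernel ARCH $1, space-separated KARCHS $2 and header install prefix $3.
# The command lines are illustrative assumptions, not the build scripts' own.
plan_kernel_headers() (
    for karch in $2; do
        echo "patch -o .config Bering-KVER.config Bering-KVER.config-$karch.patch"
        echo "make ARCH=$1 oldconfig"
        echo "make ARCH=$1 INSTALL_HDR_PATH=$3 headers_install"
    done
)

# Example:
#   plan_kernel_headers i386 "i686 i486 geode" /tmp/headers
```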

Once the "linux" source Package has been processed the "toolchain" source Package processing can start.

  • The uClibc source is extracted and the Bering-uClibc 5.x uClibc source patches are applied.
  • The binutils source is extracted.
  • The GCC source is extracted.
  • The mod-utils source is extracted (required for depmod).

Build Processing

Once the "source" processing has completed the "build" processing can start. The sequence is as follows:

  • The uClibc .config file is processed as part of "make install_headers" for uClibc.
    • This adds uClibc header files to the ones installed for the kernel above.
  • The "stage 1" binutils files are compiled.
  • The "stage 1" GCC compiler is compiled.
  • The "stage 2" GCC compiler is compiled.
  • The uClibc library is compiled.
  • The "stage 2" binutils files are compiled.
  • The mod-utils files are compiled.
  • The results of the toolchain build are copied to the staging/$GNU_TARGET_NAME/ directory.

Hints and tips for debugging toolchain build failures

  • You can get more verbose diagnostics from the uClibc build by setting environment variable "V" to either 1 or 2. See source/toolchainname/toolchain/uClibc-0.9.3*/Makefile.help for more details.
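
One way to set "V" for a single build without changing the rest of the shell session is sketched below. The helper name is hypothetical.

```shell
# Hypothetical helper: run a command with environment variable V set to $1
# (1 or 2). The subshell keeps the caller's environment unchanged.
run_with_v() (
    V=$1
    export V
    shift
    "$@"
)

# Example:
#   run_with_v 1 ./buildtool.pl -t armv6-unknown-linux-uclibc build toolchain
```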


Cross Compilation Challenges

One particular challenge with cross-compiling applications which use "configure" is that configure tries to compile and execute small test programs on the build host in order to infer things about the target host. Sometimes this works; sometimes it does not.

In cases where it does not work it is possible to "prime" configure's cache with the correct selections by setting environment variables to specify these. That is what all those "export ac_cv_*" lines are for. If you get errors when building applications you can try to establish which variable configure is looking for and set it appropriately in the block of lines that relate to the toolchain.

If the setting is required for all toolchains it should go in the block of "export ac_cv_*" lines near the top of make/MasterInclude.mk, outside the per-toolchain IF - THEN - ELSE logic.
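
When trying to establish which variable configure is looking for, the log file it writes is a reasonable place to start. The sketch below uses a hypothetical helper that lists the cache variable names mentioned in such a log; it relies on "grep -o", which is widely available but not required by POSIX.

```shell
# Hypothetical helper: list the distinct ac_cv_* cache variable names that
# appear in configure log file $1 (typically config.log), one per line.
list_cache_vars() {
    grep -o 'ac_cv_[A-Za-z0-9_]*' "$1" | sort -u
}

# Example: inspect the log, then prime one cached result by exporting it
# before running configure for the Raspberry Pi configuration name:
#   list_cache_vars config.log
#   export ac_cv_sizeof_int=4
#   ./configure --host=armv6-unknown-linux-uclibc
```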


