• The gcc compiler is used directly for C, C++, and Objective-C, and it also forms the back end for compiling other languages, such as Fortran, Pascal, and Ada.


  • gcc began as the GNU C Compiler, but the name now stands for GNU Compiler Collection.


  • You can run it as either gcc or cc, and it compiles programs written in a number of languages, including C, C++, and Objective-C.


  • You can invoke it directly as a C++ compiler by running it as either g++ or c++. It is very closely linked with libc (glibc is the current version) and with the debugger gdb.


  • gcc extends far beyond Linux. There are working versions of it for almost every operating system you can imagine, and it can be used for cross-compilation on different architectures.


  • It's very often used on a Linux workstation to compile code that will be used on an embedded board on a completely different chipset.


  • Without gcc and glibc, there's no way Linux could have grown to what it is today.


  • There are many other languages whose compilers are actually based on gcc. So, if you're using Ada or Fortran or Pascal or a number of other languages, a language-specific front end first translates the code into gcc's common internal representation, and the shared gcc back end then generates the machine code.


  • It is also possible to use gcc to compile Java using GCJ, the GNU Compiler for Java. However, this is no longer used on recent Linux distributions, and the package itself is considered obsolete.


  • Now, there are a number of stages when you run gcc on a C program. First is preprocessing; second is compiling the preprocessed source into assembly code; third is assembling that code into an object file; and finally you have to link the object files to construct an executable. There is actually a name for the program behind each step: "cpp", "gcc", "as", and "ld".


  • [Chart: the compiling stages, with the default output file extension at each step.] There are a lot of different options that can be applied to each step.
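
  • The stages named above can be run one at a time, which makes the intermediate files visible. The following is only a sketch: hello.c and the temporary directory are made-up names, and the extensions .i, .s and .o are the conventional gcc ones for preprocessed source, assembly and object files. The -S option (stop after producing assembly) is not in the option tables in these notes.

```shell
#!/bin/sh
# Walk one C file through the four stages, keeping every intermediate file.
# hello.c is a hypothetical example program created just for this sketch.
set -e
command -v gcc >/dev/null 2>&1 || { echo "gcc not installed"; exit 0; }
cd "$(mktemp -d)"

cat > hello.c <<'EOF'
#include <stdio.h>
int main(void)
{
    printf("hello, world\n");
    return 0;
}
EOF

gcc -E hello.c -o hello.i    # 1. preprocess only (cpp): C source -> hello.i
gcc -S hello.i -o hello.s    # 2. compile: preprocessed C -> assembly hello.s
gcc -c hello.s -o hello.o    # 3. assemble (as): assembly -> object file hello.o
gcc hello.o -o hello         # 4. link (ld): object file -> executable hello

ls hello.i hello.s hello.o hello
./hello
```

  • In everyday use a single gcc hello.c -o hello performs all four steps and discards the intermediate files.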


  • A more recent project, the LLVMLinux project, provides an alternative compiler, which can certainly be used for applications and eventually for the Linux kernel.


  • This grows out of the more general LLVM project, which originated at the University of Illinois at Urbana-Champaign.


  • It actually doesn't work very well for compiling the Linux kernel, because there are a lot of "gcc-isms" in the kernel source, but people are actively working hard to make that possible.


  • There's also a set of compilers from Intel that runs only on their architectures. You can download evaluation copies for free and get a non-commercial license. Once again, it's difficult to use them to compile the kernel, because the kernel uses a lot of language extensions that really stem from gcc.


  • However, note that it is good to have more than one compiler, at least for applications, because doing so will sometimes expose various weaknesses in your code: different compilers make different optimizations, and they also warn about different errors, code idiosyncrasies, etc.




  • The compiled code format will be ELF (Executable and Linkable Format), which makes using shared libraries easy; the older a.out format, while obsolete (although the name a.out survives, confusingly, as the default name for an output file), may still be used if the Linux kernel has been configured to support it.


  • Here is a list of some of the main options that can be given to gcc:


  •  
    Option	Description
    -I dir	Include dir in search for included files; cumulative
    -L dir	Search dir for libraries; cumulative
    -l lib	Link to lib; -lfoo links to libfoo.so if it exists, or to libfoo.a as a second choice
    
    
     
    Compiler Preprocessor Options
    Option	Description
    -M	Do not compile; give dependencies for make
    -H	Print out names of included files
    -E	Preprocess only
    -D def	Define def
    -U def	Undefine def
    -dM	Print #defines (used together with -E)
    
    
     
    Compiler Warning Options
    Option	Description
    -v	Verbose mode, gives version number
    -pedantic	Warn very verbosely
    -w	Suppress warnings
    -W	More verbose warnings
    -Wall	Enable a bunch of important warnings
    
    
     
    Compiler Debugging and Profiling Options
    Option	Description
    -g	Include debugging information
    -pg	Provide profile information for gprof
    
    
     
    Compiler Input and Output Options
    Option	Description
    -c	Stop after creating object files, do not link
    -o file	Output is file ; default is a.out
    -x lang	Expect input to be in lang, which can be c, objective-c, c++ (and some others); otherwise, guess by input file extension
    
    
     
    Compiler Control Options
    Option	Description
    -ansi	Enforce full ANSI compliance
    -pipe	Use pipes between stages
    -static	Suppress linking with shared libraries
    -O[lev]	Optimization level; 0, 1, 2, 3; default is 0
    -Os	Optimize for size; use all -O2 options except those that increase the size
    
    
  • A good set of options to use turns on the compiler's warnings (for example, -Wall and -pedantic from the tables above). Make sure you understand any warnings; if you make the effort to eliminate them, you might save yourself a lot of debugging. However, do not use -pedantic when compiling code for the Linux kernel, which uses many gcc extensions.
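
  • As a rough illustration of how several of the options from the tables combine, the sketch below compiles a small file with optimization and warnings enabled and a macro defined on the command line. The file opts.c and the macro GREETING are invented for this example, and the choice of -O2 is just one reasonable optimization level, not a recommendation taken from these notes.

```shell
#!/bin/sh
# opts.c and the GREETING macro are hypothetical names used only here.
set -e
command -v gcc >/dev/null 2>&1 || { echo "gcc not installed"; exit 0; }
cd "$(mktemp -d)"

cat > opts.c <<'EOF'
#include <stdio.h>
#ifndef GREETING
#define GREETING "default"
#endif
int main(void)
{
    int unused;               /* -Wall reports this unused variable */
    printf("%s\n", GREETING);
    return 0;
}
EOF

# -c stops before linking; warnings go to stderr, so save them for reading.
gcc -O2 -Wall -pedantic -DGREETING='"from -D"' -c opts.c -o opts.o 2> warnings.txt
gcc opts.o -o opts           # link the object file into an executable

cat warnings.txt             # shows the unused-variable warning
./opts                       # prints the string supplied with -D
```

  • The warning does not stop the build, which is exactly why it is worth reading and fixing rather than ignoring.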


  • Static Libraries


  • Static libraries have the extension .a. When a program is compiled, full copies of any loaded library routines are incorporated as part of the executable.


  • The following tools are used for maintaining static libraries: ar creates, updates, lists and extracts files from the library. The command:


  •  
    $ ar rv libsubs.a *.o
    
    
  • will create libsubs.a if it does not exist, and insert or update any object files in the current directory.


  • ranlib generates, and stores within an archive, an index to its contents. It lists each symbol defined by the relocatable object files in the archive. This index speeds up linking to the library. The command:


  •  
    $ ranlib libsubs.a
    
    
  • is completely equivalent to running ar -s libsubs.a. While running ranlib is essential under some UNIX implementations, under Linux it is not strictly necessary, but it is a good habit to get into.


  • nm lists symbols from object files or libraries. The command:


  •  
    $ nm -s libsubs.a
    
    
  • gives useful information. nm has a lot of other options.
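
  • Putting the three tools together, a complete round trip might look like the sketch below. The source files add.c, mul.c and main.c are invented for the example; only the archive name libsubs.a comes from the commands above.

```shell
#!/bin/sh
# add.c, mul.c and main.c are hypothetical files created for this sketch.
set -e
command -v gcc >/dev/null 2>&1 || { echo "gcc not installed"; exit 0; }
cd "$(mktemp -d)"

cat > add.c <<'EOF'
int add(int a, int b) { return a + b; }
EOF
cat > mul.c <<'EOF'
int mul(int a, int b) { return a * b; }
EOF
cat > main.c <<'EOF'
#include <stdio.h>
int add(int a, int b);
int mul(int a, int b);
int main(void)
{
    printf("%d %d\n", add(2, 3), mul(2, 3));
    return 0;
}
EOF

gcc -c add.c mul.c              # produce add.o and mul.o without linking
ar rv libsubs.a add.o mul.o     # create the archive and insert the objects
ranlib libsubs.a                # (re)build the symbol index
nm -s libsubs.a                 # show the index and each member's symbols
gcc main.c -L. -lsubs -o main   # -lsubs finds libsubs.a via -L.
./main
```

  • Because the library is static, the resulting main contains its own copies of add and mul and keeps working even if libsubs.a is deleted afterwards.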


  • Modern applications generally prefer to use shared libraries, as they are more efficient and conserve memory. However, there are at least two circumstances where static libraries are still used:


  • For programs that are used early in system startup, before the tools to work with shared libraries are fully operational.


  • For programs that want to be completely self-contained and not have to deal with potential problems from system updates of libraries that are utilized by the application. This is typically done by either proprietary (and even closed source) application suppliers, or by other large vendors, such as Google or Mozilla.


  • This is always controversial, because it is up to the vendor to make sure that any security holes or other bugs are fixed when they are discovered upstream in the included libraries.




  • A single copy of a shared library can be used by many applications at once; thus, both executable sizes and application load time are reduced.


  • Shared libraries have the extension .so. Typically, the full name is something like libc.so.N where N is a major version number.


  • Under Linux, shared libraries are carefully versioned. For example, a shared library might have any of the following names:


  • libmyfuncs.so.1.0 - The actual shared library.


  • libmyfuncs.so.1 - The name included in the soname field of the library. Used by the executable at runtime to find the latest revision of version 1 of the myfuncs library.


  • libmyfuncs.so - Used by gcc to resolve symbol names in the library at link time when the executable is created.
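
  • One way to see how the three names fit together is to build such a library by hand. This is a sketch under assumptions: myfuncs.c and main.c are made-up files, and the options -shared, -fPIC and -Wl,-soname are the usual gcc and linker options for this job rather than anything listed in the tables in these notes.

```shell
#!/bin/sh
# myfuncs.c and main.c are hypothetical files created for this sketch.
set -e
command -v gcc >/dev/null 2>&1 || { echo "gcc not installed"; exit 0; }
cd "$(mktemp -d)"

cat > myfuncs.c <<'EOF'
int twice(int x) { return 2 * x; }
EOF
cat > main.c <<'EOF'
#include <stdio.h>
int twice(int x);
int main(void)
{
    printf("%d\n", twice(21));
    return 0;
}
EOF

# Build the actual library file and record libmyfuncs.so.1 as its soname.
gcc -shared -fPIC -Wl,-soname,libmyfuncs.so.1 -o libmyfuncs.so.1.0 myfuncs.c

ln -s libmyfuncs.so.1.0 libmyfuncs.so.1   # name looked up at run time
ln -s libmyfuncs.so.1   libmyfuncs.so     # name looked up by gcc at link time

gcc main.c -L. -lmyfuncs -o main
readelf -d main | grep NEEDED             # the executable records the soname

# The current directory is not on the default library search path,
# so point the dynamic loader at it for this one run.
LD_LIBRARY_PATH=. ./main
```

  • On an installed system the soname link is normally maintained by ldconfig rather than created by hand; the manual ln commands here just make the relationships explicit.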




  • The concept of "Write Once, Use Anywhere" was a motivating factor for the development of the Java language by Sun.


  • Sometimes people say "Write Once, Run Everywhere" instead.


  • So, the idea is you could develop your Java code on any machine that has a JVM, a Java Virtual Machine, then use it on any other platform that also has a JVM.


  • And then, once you compile your Java program into bytecode, it should run on any hardware, from a cellphone to a mainframe, on any operating system without further rewriting.


  • Now, the reality is that there are multiple JVM implementations available: there are ones from Oracle (formerly Sun), OpenJDK, and IBM.


  • And there are many operating systems; in particular, on a system like Linux, where there are many distributions and each distribution has different versions, the ideal may be only imperfectly realized.


  • So, you would still have to check whether or not your Java programs work properly on a number of platforms and not just one.


  • But that statement was more true years ago. JVM implementations are much more stable and mature now, and compliance with standards is much better.


  • And therefore, you don't have to worry as much. Also, the actual choice of platforms that a particular application will run on is more limited.


  • If you write something for a small embedded type device, you may not have to run it on a mainframe, for example. So, having this Java abstraction layer is a big advantage.


  • It's always worked better on the server side than on the client side, because on the client side, you have human interface standards and human experience in desktop/window managers.


  • And that's more what the actual user sees, and that can differ more than on the server, which is hidden from the user.




  • In the early days of Linux, obtaining a proper Java installation and getting it to work could be quite difficult.


  • This was due to a combination of technical/licensing or political/philosophical issues. As a result, the early implementations had serious deficiencies and tended to not be up to date with official releases from Oracle/Sun.


  • Today, the situation is both easier and more stable. All major Linux distributors prepackage and configure a JRE (Java Runtime Environment) and/or a JDK (Java Development Kit); the JRE is usually part of a default installation, and the JDK is easily added if necessary.


  • For example, on Red Hat Enterprise Linux and similar distributions, it is sufficient to do:


  •  
    $ sudo yum install java-1.8.0-openjdk
    
    
  • to get the JRE, and:


  •  
    $ sudo yum install java-1.8.0-openjdk-devel
    
    
  • to get the full JDK. You may have to update the version number, of course. On recent Ubuntu distributions, you can do:


  •  
    $ sudo apt-get install default-jre default-jdk
    
    
  • or


  •  
    $ sudo apt-get install openjdk-8-jre openjdk-8-jdk
    
    
  • which will accomplish the same task since openjdk is the default.




  • It is possible that you will have (or want to have) multiple versions of Java components installed on your system. While you can directly invoke a specific choice without disturbing the general system configuration, it is also possible to set the system default in an easily reversible way.


  • In order to enable this, most Linux systems use the alternatives tool to set the system default. This utility is used for many alternative setting tasks on the system, not just those which are Java-related.


  • For example, to reconfigure the choice for Java on a Red Hat-based system:


  •  
    $ sudo alternatives --config java
    
    There are 2 programs which provide ’java’.
           
      Selection    Command
    -----------------------------------------------
       1           java-1.7.0-openjdk.x86_64 \
                          (/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.161-2.6.12.0.el7_4.x86_64/jre/bin/java)
    *+ 2           java-1.8.0-openjdk.x86_64 \
                          (/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/bin/java)
    
    Enter to keep the current selection[+], or type selection number:
    
    
  • Using this is straightforward: press Enter to keep the current selection, or type the number of the selection you want.


  • On Debian-based systems, such as Ubuntu, the command is update-alternatives. How this works is pretty straightforward. The directory /etc/alternatives contains symbolic links to the proper location:


  •  
    $ ls -l /etc/alternatives/java
    
    lrwxrwxrwx 1 root root 73 Jan 19 07:02 /etc/alternatives/java -> \
         /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/bin/java
         
    
  • Note that the generic Java binary is also just a symbolic link:


  •  
    $ which java
    /usr/bin/java
    $ ls -l /usr/bin/java
    lrwxrwxrwx 1 root root 22 Jan 19 07:02 /usr/bin/java -> /etc/alternatives/java
    
    
  • You can also set this for the Java compiler, javac, as in:


  •  
    $ sudo alternatives --config javac
         
    There is 1 program that provides ’javac’
           
     Selection    Command
    -----------------------------------------------
    *+ 1           java-1.7.0-openjdk.x86_64 \
                     (/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.161-2.6.12.0.el7_4.x86_64/bin/javac)
    
    Enter to keep the current selection[+], or type selection number:
    
    
  • To just change versions for a specific user, you could put something like this in the $HOME/.bashrc file:


  •  
    export JAVA_HOME=/usr/lib/jvm/java-1.6.0-sun-1.6.0.21.x86_64/jre
    export PATH=$JAVA_HOME/bin:$PATH
    
    
  • You can always see what version of Java is actually being run in any environment by doing


  •  
    $ java -version
    
    
  • Environment Variables and Class Paths


  • If you do:


  •  
    $ java -version
    
    
  • and things work fine, your basic path is set correctly as described in the alternatives section. You might also try:


  •  
    $ readlink -f $(which java)
    
    
  • which will follow the links to print out the path to the actual installation.


  • The CLASSPATH environment variable can be used to locate user classes that are in addition to those directly part of the JDK or JRE installation and extensions.


  • While you can always set this variable in your .bashrc initialization file, it then applies only to that particular user's environment, and it is in effect at all times, for every Java program that user runs.


  • It is thus generally better to use the -cp option when invoking Java with the application: this could be made part of a startup script, for example.
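
  • A startup script along these lines might look like the following sketch. Everything application-specific in it is an assumption made for illustration: the APP_HOME variable, the lib/ directory of .jar files, the helper build_classpath and the class name com.example.Main do not refer to any real application.

```shell
#!/bin/sh
# Hypothetical launcher: APP_HOME, the lib/ layout and com.example.Main
# are assumptions made for illustration only.

# Join every .jar in the given directory into a colon-separated class path.
build_classpath() {
    classpath=""
    for jar in "$1"/*.jar; do
        [ -e "$jar" ] || continue        # skip the unmatched pattern if no jars exist
        classpath="${classpath:+$classpath:}$jar"
    done
    printf '%s\n' "$classpath"
}

APP_HOME=${APP_HOME:-/opt/myapp}
APP_CLASSPATH=$(build_classpath "$APP_HOME/lib")

# Print the command instead of running it, so the sketch works even
# where the application itself is not installed.
echo java -cp "$APP_CLASSPATH" com.example.Main "$@"
```

  • Because the class path is passed with -cp only for this one invocation, nothing leaks into the user's environment, and other Java programs are unaffected. In a real launcher the final echo would be replaced by exec.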




  • Many developers are accustomed to working in integrated development environments, and there is no shortage of available ones in Linux. We will focus here on Eclipse and NetBeans.


  • However, there's actually a pretty long list of possible development environments available on Linux.


  • A lot of Linux developers prefer to work more at the command line, or they may even use emacs as essentially an integrated development environment, since it has built-in support for editing, debugging, and all sorts of other things you would expect to find in an integrated environment.


  • Eclipse can be traced back to at least 1984. Since 2001 it has been an open source project, and it's been under the Eclipse Foundation since 2004. It's released under its own license, the Eclipse Public License.


  • It's a little bit incompatible with the GPL, but this has no effect on any product you develop using Eclipse. It only causes some complications if you want to extend or modify the IDE itself.


  • While some distributions still have Eclipse in their package management systems, the accepted method of installation today is to download the latest version directly from the Foundation's website and install it.


  • This is very standard, and a lot of distributions no longer give you any other way to do it. Eclipse is very modular: you don't have to install any more of it than you want, or support for languages you don't need.


  • It was originally intended primarily for Java projects, but it now works with almost any language. If you've used Eclipse on another operating system, your experience will be the same on Linux.


  • NetBeans goes back to 1997. It has passed through Sun and Oracle, and today it's released under CDDL and the GPL version 2 licenses. You'll find it in some distributions, in their standard packaging system.


  • So, a single command might actually install it; on other distributions, as with Eclipse, you should go to the source website and just download and install it, which is a pretty straightforward procedure.


  • There are a lot of choices about what to download and install. The download ranged from 31 megabytes to 239 megabytes the last time we checked, depending on what you decide to include, and it's pretty easy to do the install.


  • You can just type "netbeans" on a command line to run it, or there are graphical menus from which you can launch it. The same thing is true in Eclipse.


  • Once again, if you have used NetBeans on another operating system, it'll look the same on Linux, and there should be no real learning curve involved.




  • One of the most important functions of any Linux distribution is package management.


  • Now, originally, most Linux distributions were simply collections of tarballs - archives of either binaries or sources which needed to be compiled in order to install various applications and other system services, etc.


  • And you can still find distributions that function that way, including Slackware, which is one of the oldest distributions (if not the oldest) still in operation.


  • But there are disadvantages to distributing things this way. It can be difficult to remove all the files from a software package, if you need to remove it.


  • You might delete some package that other packages need.


  • Or install a package that won't work, because it needs other packages that haven't been installed yet.


  • A developer can easily lose track of exactly what was put into the package, what sources there were, what patches or changes were made, etc.


  • And, when you do updates or upgrades of the system, you can run into problems, because you may leave debris on the system you will no longer need.


  • If you do the upgrade while the system is running, you can cause all sorts of horrendous problems, even a crash. The order of upgrading packages might be important, so you have to think about it.


  • And, if you have a whole group of packages that requires simultaneous updates, you can get into trouble and updates may conflict with each other. So, there are a number of systems which are in operation to handle this in a coherent way.


  • The two most common are RPM and APT.


  • RPM stands for the Red Hat Package Manager; while it was originally developed by Red Hat, it's used by other distributions as well, even ones which aren't directly related to Red Hat, such as SUSE and openSUSE.


  • APT, the Advanced Packaging Tool, works with the so-called Debian packages.


  • It came, not surprisingly, from the Debian distribution, but it's the foundation of Ubuntu, and Linux Mint, and other distributions which are built on top of Debian.


  • Other packaging systems are in use as well: for instance, Gentoo uses Portage (with its emerge command), and Arch Linux uses Pacman. But RPM and APT are the two main ones that you're liable to run into, especially at an elementary level.


  • There's a tool called "alien" which can convert packages between RPM and Debian, but it's rather imperfect.


  • So, you may find it useful as some kind of a guide, but don't depend on it being perfect.


  • Now, what are some of the advantages for developers of using a coherent package management system? One, you can have repeatable builds, so that no matter how something is installed, it comes out the same.


  • You can keep track of dependency data, you know what packages need other packages.


  • You can include the original sources and you can also show all the steps involved in, perhaps modifying them, recompiling, and doing some configuration control.


  • And, you can document any explanation of changes that are made to the upstream sources.


  • For system administrators and end users, there are also many benefits. You can easily verify the integrity of installation, see if any files have been modified, for example.


  • It's easy to install and remove packages. And, when you do this, you can preserve any customized and modified configuration files you may have changed. You can check on install and remove that you're not getting rid of any resources that you really need to keep.


  • And, you can easily do things like find out what files are in a package or look at a given file and ask what package it belongs to. The maintenance of software by using package systems is a critical task on Linux distributions.
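
  • On the command line, these queries are handled by the low-level tools rpm and dpkg. The sketch below wraps the equivalent commands from each family; the function names pkg_tool, files_in, owner_of and verify_pkg are invented for this example, and the check for which tool is present is only a rough heuristic.

```shell
#!/bin/sh
# Thin wrappers over the low-level query commands of the two main systems.
# pkg_tool, files_in, owner_of and verify_pkg are hypothetical helper
# names invented for this sketch.

# Rough heuristic: prefer dpkg if it is present, otherwise rpm.
pkg_tool() {
    if command -v dpkg >/dev/null 2>&1; then echo dpkg
    elif command -v rpm >/dev/null 2>&1; then echo rpm
    else echo none
    fi
}

# List the files installed by a package.
files_in() {
    case "$(pkg_tool)" in
        dpkg) dpkg -L "$1" ;;
        rpm)  rpm -ql "$1" ;;
        *)    echo "no supported package tool found" >&2; return 1 ;;
    esac
}

# Report which package owns a given file.
owner_of() {
    case "$(pkg_tool)" in
        dpkg) dpkg -S "$1" ;;
        rpm)  rpm -qf "$1" ;;
        *)    echo "no supported package tool found" >&2; return 1 ;;
    esac
}

# Check a package's installed files against the package database.
verify_pkg() {
    case "$(pkg_tool)" in
        dpkg) dpkg --verify "$1" ;;
        rpm)  rpm -V "$1" ;;
        *)    echo "no supported package tool found" >&2; return 1 ;;
    esac
}

echo "package tool on this system: $(pkg_tool)"
```

  • With these defined, owner_of /bin/ls asks which package a file belongs to, files_in bash lists a package's contents, and verify_pkg bash reports files that differ from what the package installed.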


  • Everything else you do requires a coherent system. One thing you want is a healthy connection to upstream developers.


  • You want to be able to feed back to them critiques and suggestions about how the thing is running, report bugs, etc. But you also need a method of deploying changes made by the upstream developers back to the end users and system administrators.


  • You can establish clear policies for how configuration files are dealt with. You can make sure that you don't have bit rot, that packages are updated and upgraded in timely fashion, and bugs and security holes are not allowed to proliferate.


  • And maybe the most important thing is to make sure you get all the package dependencies correct, so that, if something changes, anything that depends on it is also explicitly updated. And, one thing to keep in mind is that Linux distributions include software from many sources.


  • It's a critical job of a distributor to make all of that mesh together. It's a very difficult thing to do, and distributors have a lot of experience in making sure this works.


  • Package maintainers are often employees of distributions, especially for commercial ones such as Red Hat or Ubuntu, but there are many who are individual volunteers, and that's particularly true with distributions like Debian.


  • In any case, maintainers do need to think long-term. How can they make sure that the way they build their packages will still work when you have updates and upgrades or the use of different distributions, etc.?