What's "porting software"? What's involved?


I've not done much programming other than putting #include <stdio.h> into a file called helloworld.h, writing #include "helloworld.h" and printf("hello world"); into a file called helloworld.c, running cc helloworld.c, and then ./a.out (and even then I forgot a standard library), plus some PHP (which I quite like) (I'm not very object-oriented, I'm more procedural)...

So as far as I know, a software that's programmed, if compiled for the target CPU/hardware, should work regardless of the operating system...

I'm talking a purely console-based program here...

But the reality is that software has to be ported... like my program would probably work on DOS, Windows, Linux, Solaris, BSD4.4, and FreeBSD, but more complex programs have to be ported from one operating system to another...

Why? If the libraries are all the same everywhere, and the principles that apply to my simple program above expand to fit libraries and stuff, how come software has to be ported?
 
If the libraries are all the same everywhere, and the principles that apply to my simple program above expand to fit libraries and stuff, how come software has to be ported?
Because we don't live in an ideal world and those libraries aren't the same everywhere.
 
Even console stuff has the risk of being written in a non-portable way. For example a developer may opt to use GNU getline or ncurses. This makes it quite a bit more tricky to port to Windows without some fiddling or 3rd party "compatibility" libraries such as pdcurses.

In general, if you stick to only the standard C library, your application will be quite limited in how it interacts with the OS or the user, but it will be mostly portable. The only remaining risks are architecture-specific differences such as byte order (endianness).
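To make the endianness point concrete, here is a minimal sketch (the function names are made up for this example). Dumping an integer's memory straight to a file or socket gives different bytes on different CPUs; building the bytes with shifts gives the same result everywhere:

```c
#include <stdint.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    /* Look at the first byte in memory: it holds the low byte only on little-endian hosts. */
    return *(const unsigned char *)&probe == 1;
}

/* Stores value as four big-endian bytes. Uses shifts only, so the
   result does not depend on the host's byte order. */
void store_be32(unsigned char out[4], uint32_t value)
{
    out[0] = (unsigned char)(value >> 24);
    out[1] = (unsigned char)(value >> 16);
    out[2] = (unsigned char)(value >> 8);
    out[3] = (unsigned char)value;
}

/* Reads four big-endian bytes back into a host integer. */
uint32_t load_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Code that writes with store_be32 and reads with load_be32 never needs to ask host_is_little_endian at all, which is usually the easier thing to port.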

I think the best solution is creating standards like POSIX. Unfortunately not all operating systems adhere to them (and very rarely 100%). Also you have sh*tty operating systems like Android or iOS where they are in essence POSIX compliant but are so artificially crippled that much of the "porting" effort is to tiptoe around the limitations that the sh*theads who develop them have set.

Unfortunately I believe C (and to some extent C++) is the best solution to portability. If you have ever had the pleasure to port something like Java or .NET to a new platform, you will understand that these solutions are the exact opposite of cross platform.
 
The problem with standards is that there are so many of them.

POSIX is helpful but can be quite ambiguous at times. When things are defined as SHOULD or COULD there's already a chance for differences in the implementation.
 
Unfortunately I believe C (and to some extent C++) is the best solution to portability. If you have ever had the pleasure to port something like Java or .NET to a new platform, you will understand that these solutions are the exact opposite of cross platform.
Have to disagree there, it all depends on the software and how it was made.

In fact, it's the other way around. If you stick to only the standard libraries your application will be extremely flexible in how it interacts and will be fully portable. If you use libraries which are usually tied to one OS then yes, you may run into problems. But that has nothing to do with the environment but more so with the programmers. Generally speaking Java is much more cross platform.
 
Generally speaking Java is much more cross platform.

You're kidding me right? The Java VM is hundreds of thousands of lines of code. Just check out how many patches it requires to compile on FreeBSD (https://svnweb.freebsd.org/ports/head/java/openjdk8/files/). And that is *after* Java 8 has been "officially" ported to FreeBSD.

If I was given a task of porting either a C program or a Java program to something like iOS or Arduino (which have no native Java VM available), I would certainly rather port the C program.

I think the days when we could safely assume that our target platform can run a Java VM are quite far behind us. Even with things like Android that provide Java but a very different standard library, Java is severely limited when it comes to portability (no javax.swing GUI system, for example).
 
You're kidding me right? The Java VM is hundreds of thousands of lines of code.
Aaah, ok, porting the VM itself, that I agree with. I was under the impression that you were talking about Java software, so software written for and developed in Java. And that is seriously different, even though it also shares some similarities.
 
Unfortunately I believe C (and to some extent C++) is the best solution to portability.
In theory, you MIGHT be right. If the C/C++ is written to be completely ANSI compliant (no undefined or platform-specific behavior of the compiler, which is hard, since even variable-length arrays on the stack are a GNU extension in C89 and C++), and it only uses POSIX-defined system calls, then it can be ported very easily to any platform that has an ANSI compiler and is POSIX compliant.

In practice, that scenario does not exist. The C and C++ standards leave an enormous amount of stuff undefined, and compilers aren't very good about giving clear warnings when someone uses undefined behavior. And the POSIX interfaces are not useful enough to write many interesting programs (unless you are writing simple filters that process data and don't interface with the OS in any other way).
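One classic case of undefined behavior is signed integer overflow: a + b on two ints may appear to wrap on one compiler and get optimized into something else on another. A minimal sketch of checking before adding rather than after (checked_add_int is a made-up name for this example):

```c
#include <limits.h>
#include <stdbool.h>

/* Adds a and b without ever overflowing. Returns true and stores the
   result in *sum on success; returns false (and leaves *sum alone) if
   the mathematical result would not fit in an int. The range check
   happens before the addition, so no undefined behavior is triggered. */
bool checked_add_int(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}
```

The tempting version, adding first and then testing whether the result went negative, relies on exactly the behavior the standard leaves undefined.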

In my experience, Java is easier to port, since a much larger fraction of the language and of the utilities are clearly defined. That is until you get into GUI stuff, where "porting" also requires "making it look acceptable on different devices". I have never used .NET. Modern scripting languages (Python, Ruby, ...) are even easier, since they make good engineering simpler.

But the real answer is that portability is one of the things that needs to be designed into the software development process. If you are intending to eventually port the software to other platforms (even if it is Unix -> Unix, like Linux/AIX/*BSD/HP-UX/Solaris), you need to design your system with that in mind. That, for example, means having very clear coding standards that prevent using undefined/platform-specific behavior of the compiler, and abstracting OS interfaces that are not 100% guaranteed by POSIX in a single place so they are easy to find and change when the next OS needs to be supported. Even little things matter: if you want portability, then write super-simple and logical Makefiles, since GNU make is not preferred or available on all platforms.
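As a rough illustration of "abstracting OS interfaces in a single place", here is a minimal sketch. The port_* names are hypothetical, and a real project would split this into a header and one source file per platform; the point is only that the #ifdef lives in one spot and the rest of the code never mentions the OS:

```c
#include <stdio.h>
#include <stdlib.h>

/* Directory separator used by the host OS. */
char port_path_separator(void)
{
#if defined(_WIN32)
    return '\\';
#else
    return '/';
#endif
}

/* The user's home directory, or NULL if the environment does not say.
   The variable name differs between Windows and Unix-like systems. */
const char *port_home_dir(void)
{
#if defined(_WIN32)
    return getenv("USERPROFILE");
#else
    return getenv("HOME");
#endif
}

/* Joins dir and name with the host separator.
   Returns 0 on success, -1 if the result does not fit in out. */
int port_join_path(char *out, size_t out_size, const char *dir, const char *name)
{
    int n = snprintf(out, out_size, "%s%c%s", dir, port_path_separator(), name);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

When the next OS needs to be supported, the changes are confined to these few functions, and callers of port_join_path stay untouched.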
 
Because we don't live in an ideal world and those libraries aren't the same everywhere.

Why though? If the libraries are based on the standard libraries, why doesn't it make like a fractal and so things are compatible cross-platform?

Again, I'm just talking low-level stuff, nothing graphical, nothing that uses any kind of kits or APIs...

I just can't understand that, considering that the very first libraries were based on the most basic library, and that that very basic library is portable, how come it doesn't daisy-chain, and make it so everything's portable?

Also, nobody answered my question... I still don't know what's involved...
 
Why though? If the libraries are based on the standard libraries, why doesn't it make like a fractal and so things are compatible cross-platform?
Implementation differences. If you give 10 programmers something to solve you will get 10 different solutions.

Again, I'm just talking low-level stuff, nothing graphical, nothing that uses any kind of kits or APIs...
You can't do anything without touching an API. Even a basic printf will, at some point, call an API to actually print something.
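A very simplified picture of what sits underneath printf, as a sketch only (raw_print is a made-up name, and a real C library also does formatting, buffering and retries on partial writes). The C library on each OS has to end up in that OS's own call: write on POSIX systems, WriteFile on Windows.

```c
#include <string.h>

#if defined(_WIN32)
#  include <windows.h>
#else
#  include <unistd.h>
#endif

/* Sends text to standard output using the OS interface directly.
   Returns the number of bytes written, or -1 on error. */
long raw_print(const char *text)
{
    size_t len = strlen(text);
#if defined(_WIN32)
    DWORD written = 0;
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    if (!WriteFile(out, text, (DWORD)len, &written, NULL))
        return -1;
    return (long)written;
#else
    return (long)write(STDOUT_FILENO, text, len);
#endif
}
```

Your program only sees printf, which is the same everywhere; somebody had to write the part below it separately for every platform, and that is already a small porting job.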
 
In my experience, Java is easier to port, since a much larger fraction of the language and of the utilities are clearly defined..

True in a lot of situations, and I see where ShelLuser was also coming from, however Java is one of those things where it can suddenly become impossible to port and a rewrite (in a different language) is required. Whereas C / C++ are never particularly nice to port but with enough man-hours spent, are always able to be ported.

C/C++ programs can become impossible to port only in a few rare circumstances, such as when the program uses a garbage collector (e.g. Boehm's). That is very portable to most platforms, but on some rare platforms certain assumptions cannot be made, e.g. a contiguous heap or a stack (such as, I believe, some PIC chips?).
Another one is if it relies on virtual memory (such as VirtualAlloc / mmap). Again, not all platforms support this.

And the big issue is that the JavaVM itself relies on both a GC and virtual memory making it finite in terms of portability.
 
Real-world example: On FreeBSD, there is no emacs support on the Raspberry Pi, because it uses a very strange way of doing memory allocation. Emacs works on FreeBSD on most other platforms, and works on the Pi using Linux, but that one combination doesn't.
 
Heh or worse, the standard Hello World program for C from the K&R "The C Programming Language" book is not 100% working on Windows.

Code:
#include <stdio.h>
int main(void)
{
   printf("hello, world\n");
   return 0;
}

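On Windows, a new line is \r\n and not just \n. The C runtime normally hides this by translating \n to \r\n on text-mode streams such as stdout, so the issue shows up when that translation is bypassed (a stream switched to binary mode, for example) and whatever consumes the output insists on \r\n. In that case you can end up with something like: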

hello, worldC:\>

rather than

hello, world
C:\>
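
A quick way to see how much of this the C library handles for you is to write the same string through a text-mode and a binary-mode stream and count the bytes that land on disk. A minimal sketch (written_size is a name made up for this example); on Unix-like systems both modes should give 13 bytes for "hello, world\n", while on Windows I'd expect text mode to give 14 because of the \n to \r\n translation:

```c
#include <stdio.h>

/* Writes text to path using the given fopen mode ("w" or "wb"), then
   reopens the file in binary mode and counts its bytes.
   Returns the byte count, or -1 on any I/O error. */
long written_size(const char *path, const char *mode, const char *text)
{
    FILE *f = fopen(path, mode);
    long size = 0;

    if (f == NULL)
        return -1;
    if (fputs(text, f) == EOF) {
        fclose(f);
        return -1;
    }
    if (fclose(f) != 0)
        return -1;

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    while (fgetc(f) != EOF)
        size++;
    fclose(f);
    return size;
}
```

Calling written_size("out.txt", "w", "hello, world\n") and then the same with "wb" on both systems shows the difference without needing a hex editor.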
 
The first time I saw a C compiler on a mainframe (an IBM 370), I had to swallow hard: That machine doesn't have a stack! How do you run a language that's by definition recursive? Turns out it's easy: Use one of the registers as a stack pointer. Very bizarre.

A few years later I discovered that Apache (the well-known web server) has been ported to run on the mainframe TPF operating system. Now, referring to "TPF" as an operating system is a bit of a joke: it is nothing but an ultra-efficient transaction processing facility, used exclusively for high-volume transaction systems (such as airline reservations and credit card processing). TPF doesn't even have a file system, or memory allocation, or facilities like that. There are no compilers, no user interfaces (other than an operator console). How the heck did they get Apache to work on it? I don't know, but it must have taken a serious amount of genius, stupidity, and craziness.
 
The first time I saw a C compiler on a mainframe (an IBM 370), I had to swallow hard: That machine doesn't have a stack! How do you run a language that's by definition recursive? Turns out it's easy: Use one of the registers as a stack pointer. Very bizarre.

A few years later I discovered that Apache (the well-known web server) has been ported to run on the mainframe TPF operating system. Now, referring to "TPF" as an operating system is a bit of a joke: it is nothing but an ultra-efficient transaction processing facility, used exclusively for high-volume transaction systems (such as airline reservations and credit card processing). TPF doesn't even have a file system, or memory allocation, or facilities like that. There are no compilers, no user interfaces (other than an operator console). How the heck did they get Apache to work on it? I don't know, but it must have taken a serious amount of genius, stupidity, and craziness.

I think C is the language that's closest to ASM, so it's probably that they managed to translate Apache into very basic machine code from its C source (I think it's written exclusively in low-level C to make it as fast as possible, like PHP, if I'm not mistaken...)

Also I didn't know TPF existed... is it still a thing?

I always thought credit cards and stuff like that just ran on distributed clusters, like Google... that you could essentially run a credit card company with a big enough computer and MySQL...
 
The first time I saw a C compiler on a mainframe (an IBM 370), I had to swallow hard: That machine doesn't have a stack! How do you run a language that's by definition recursive? Turns out it's easy: Use one of the registers as a stack pointer. Very bizarre.
AmigaOS on 680x0 does something similar. The A7 address register doubles as a stack pointer. A6 was typically used as the base address of a library in order to JSR into library functions. I did a lot of assembly on the Amiga, and I still consider it one of the easiest to program in assembler :D
 
I think C is the language that's closest to ASM, ...
Well, yes and no. C is like ASM in that it is very primitive. It is not like ASM in that it presupposes a certain function call model, with certain parameter passing conventions (pass-by-value and by pointer), and the basic assumption of recursive calls. It pretty much assumes that you have a stack. Older programming models (such as COBOL and Fortran) do not assume that the hardware has a stack, and are explicitly not defined to be recursive (although on modern hardware, they usually allow recursive calls).

Also I didn't know TPF existed... is it still a thing?
I didn't actually know whether it is still available until a moment ago. According to Wikipedia, it is still shipping, and getting updated.

I always thought credit cards and stuff like that just ran on distributed clusters, like Google... that you could essentially run a credit card company with a big enough computer and MySQL...
You've got to be kidding. No bank would ever run their transaction processing on MySQL and on a cluster with commodity hardware and commodity operating systems. To begin with: where is the indemnification? If something goes wrong, and the bank has a serious disaster, it has to be able to get financial satisfaction from its OS, middleware and hardware vendor. You can sue Oracle, HP and IBM; you don't have to bother suing RedHat or some no-name Chinese vendor of rackmount computers (there isn't enough money there). MySQL is an interesting case, as it is now owned by Oracle.

The next question is support. Any of the companies like Oracle, IBM and HP have tens of thousands of people in their support departments. If something goes wrong, any of those three can send dozens or hundreds of specialists to fix the problem, and perform recovery. That's something that doesn't exist at that scale in the FOSS or commodity ecosystem.

Real back-end bank processing is done mostly on mainframes (IBM-370 instruction set machines, sold either by IBM or by various Japanese competitors), high-end Unix servers (AIX, HP-UX, Solaris, all on vendor-specific hardware), and to a diminishing amount on other systems (some banks still use VMS, although not on VAXes any longer, or HP-3000 systems for smaller banks, or IBM i-Series a.k.a. AS-400, or Tandem Non-Stop servers). Financial institutions are famously risk-averse and old-fashioned.

Now don't get me wrong: big banks (and financial companies) also have giant clusters of commodity machines, often using Linux and free middleware. They use Hadoop, various NoSQL databases, Lustre, and so on. But that's for analysis, not for back-end transaction processing.

And to SirDice's comment: Indeed, the 68000 is very pleasant for assembly programming. I also liked doing assembly on the VAX. Not fond of doing it on the IBM 370, the instruction set is too weird (note that this doesn't apply to the mainframe models that ship today, they have 64-bit instruction sets that are very rich, it applies to the 370 that shipped in the 1970s). The high-end CISC machines are just nice that way. RISC was really the end of assembly programming.
 