The fact that all these functions satisfy the conditions is easy to verify. Here is an outlined sketch of a proof that there are no other examples.
Step 1: Prove that the line segment between any two points A and
B is mapped into the line segment between f(A) and
f(B).
This is done by considering a point C on the extension of the line
segment. If the closed segments AB, AC and BC are all
mapped into closed line segments, then the claim must hold true.
Step 2: Prove that f-1 must satisfy the same
conditions as f.
Trivial from step 1.
Step 3: Prove that f is continuous.
This can be proved by definition. Consider what the neighborhood around a point,
A, gets mapped to by f and what the neighborhood around
f(A) gets mapped to by f-1.
Step 4: Consider f(<1,1>), f(<1,2>)
and f(<2,1>) and show that from these it is possible to find out
where f(<1,∞>) and f(<∞,1>) are
(asymptotically) mapped to.
(Note: Throughout this proof-sketch I use terms such as
f(<1,∞>) and f(<∞,1>), with or without
the word "asymptotically", in a semi-formal manner. To be strict, a better way
would have been to explore
{f(<1,x>)|x≥1} and {f(<x,1>)|x≥1}
This and other formal wordings for this mathematical object would have made the
proof-sketch cumbersome and difficult to follow. I've opted to use the loose
notation f(<1,∞>), sometimes qualified by the equally loose
wording "asymptotic", to convey the idea that we know in what direction the
line going out from f(<1,1>) to any f(<1,x>)
is directed,
and the question of where "f(<1,∞>)" lies is only a question
of how much this line can be elongated, which may either be to some bounded or
to an unbounded length.
In the proof-sketch for this (the current) step, I go into a
little detail regarding what I mean by this notation, but generally I
let it stand a is.
My apologies go to readers more pedantic than me, who may need to
translate the arguments given below back to formalistic wordings in order to
make sense out of them. So, Informally:)
W.l.o.g., let us consider f(<1,∞>).
To reach f(<1,∞>) asymptotically, one must map
<1,x> for increasingly large values of x. The image of
the line segment between <1,1> and any <1,x> must always
be a line segment, so all f(<1,x>) must lie on a single
line. The point where f(<1,x>) converges to can be
infinitely far from f(<1,1>) or not. However, due to continuity
we know that if it is not in infinity, it must be along the axes. Therefore,
the direction of the vector from f(<1,1>) to
f(<1,2>) resolves the asymptotic position of
f(<1,∞>) uniquely.
Step 5: Show that if f(<1,∞>) diverges, then
f(<x,∞>) diverges for any positive x, and that
all line segments parallel to the Y-axis get mapped to line segments parallel
to each other. Alternatively, if f(<1,∞>) converges then all
f(<x,∞>) converge to the same point. (And,
equivalently, regarding f(<∞,1>) and horizontal
lines.)
If this were not true then one could construct
a line on the image of f that intersects the
image of one vertical line but not of another vertical line. This forms a
contradiction because the image of this line in f-1 would
suggest that it is parallel to some vertical lines, but not to all. This
process can also be done on f-1 instead of f.
Step 6: Show that all values of f can be determined
uniquely, using only the three values of it from step 4.
In step 5 we showed that either all vertical/horizontal lines are mapped into
parallel lines, or they are all mapped into lines that converge to a single
point (at infinity). This means that given f(<1,1>),
f(<1,2>), f(<2,1>), f(<1,∞>)
and f(<∞,1>) we are able to
First, note that we can construct an arbitrarily dense lattice of points we know the mapping of. For example, by passing a horizontal line through <1,2> and a vertical line through <2,1> we can construct f(<2,2>). Then, by intersecting the line between <1,1> and <2,2> with the line between <1,2> and <2,1> we can find f(<1.5,1.5>). Passing axis-parallel lines through <1.5,1.5> we can then construct f(<1.5,2>), f(<2,1.5>), f(<1.5,1>) and f(<1,1.5>), thereby having made the lattice twice as dense. Repeating this procedure on each of the four quarters created can make this lattice arbitrarily dense.
Second, note that the lattice can also be extended at will. For example, by passing a line through <2,1.5> and <1.5,2> we can determine f(<1,2.5>) and f(<2.5,1>), allowing us to start with a square whose side is 1.5 times as large as the original square.
By repeating such extrapolation and interpolation procedures we can find the mapping of points arbitrarily close to any desired point. The rest of f can be determined by the constraint of continuity.
Step 7: Put it all together.
Suppose that we have f(<1,1>),
f(<1,2>), f(<2,1>),
and the lengths from f(<1,1>) to f(<1,∞>) and
f(<∞,1>).
Even without the additional constraints
introduced in Step 4, using only the conclusions of steps 5 and the methods
introduced in step 6 we can
determine the entire function f. This function can be written explicitly
as f(<x,y>) =
<(a1x+b1y+c1) /
(a3x+b3y+c3),
(a2x+b2y+c2) /
(a3x+b3y+c3)>.
A general function of this type is, however, not a bijection because it is
either not one-to-one or is not onto. The only invertible functions of this
type are the ones listed as possible solutions. This can be seen from the
fact that if line segments are mapped into line segments then triangles are
mapped into triangles. Non-formally, the argument can then be completed as
follows: The entire plane can be viewed as the inside of the triangle
<0,0>, <∞,0>, <0,∞>. For f to be a
bijection, this triangle must be mapped into itself, so the three vertices of
this triangle must be mapped to themselves, in some permutation.
This gives exactly 3!=6 possible solutions, each of which is a
unique solution up to multiplication of each coordinate by a positive scalar.
The same proof can easily be extended to n dimensions, and for exactly the same reason the number of solutions in n dimensions, up to multiplication of each coordinate by a positive scalar, is (n+1)!. These solutions can most easily be described in projective space:
Instead of looking at f as a transformation from (ℜ+)n to (ℜ+)n, consider it as a transformation from (ℜ+)n+1 to (ℜ+)n+1, where both domain and range are considered under the equivalence relationship x ≡ αx. (Input and output can be transformed between (ℜ+)n space and (ℜ+)n+1 space by adding a '1' as a dummy n+1'th coordinate.)
In this new representation, all f's, up to a scaling transformation in each coordinate that is done in (ℜ+)n space, is simply an order permutation on the coordinates of the input.