% set x [expr {1<<(2**31-1)}]; concat
% set x [expr {$x<<(2**31-1)}]; concat
% set x [expr {$x<<(2**31-1)}]; concat
% set x [expr {$x<<(2**31-1)}]; concat
% set x [expr {$x<<(2**31-1)}]; concat
% set x [expr {$x<<(2**31-1)}]; concat
% set x [expr {$x<<(2**31-1)}]; concat
Program received signal SIGSEGV, Segmentation fault.
0x000000000046dfca in TclBN_mp_grow (a=0x7fff9fd04f20, size=536870920)
at /home/dgp/cvs/tcl8.5/unix/../libtommath/bn_mp_grow.c:48
48 a->dp[i] = 0;
(gdb) print a
$1 = (mp_int *) 0x7fff9fd04f20
(gdb) print a->dp
$2 = (mp_digit *) 0x861fc68
(gdb) print i
$3 = 460175076
(gdb) print a->alloc
$4 = 536870920
(gdb) print a->dp[i]
Cannot access memory at address 0xe3cfb388
(gdb) print size
$5 = 536870920
Suspect there's been an overflow in the
second argument to XREALLOC() a few
lines earlier, and we didn't really request
as much memory as we actually need.
Tcl's use of libtommath sets mp_digit
to be 'unsigned long' which on my test
system is 4 bytes, so sizeof(mp_digit) * size
is 2147483680, which is bigger than INT_MAX.
If something in the machinery giving meaning
to the XREALLOC() macro tries to force it
through a signed int, I'd expect trouble at this
point. Can kbk (or anyone) shed light on just
what XREALLOC() means in this context ?
Yup, XREALLOC gets defined in terms of
Tcl's ckrealloc routine which has the well
known handicap of using signed ints for
allocation size.
Hmm... perhaps not. Tcl_Realloc() actually
takes an "unsigned int" argument, which
ought to still be ok.
d'oh! I wasn't on the system I thought.
Where I was, sizeof(unsigned long) is 8.
Hmm. on such systems, shouldn't we
revise the mp_digit so that we don't use
64-bit integers to store only 28 bits of
data?
I must still be missing something.
confirmed. On a Linux x86_64 system,
Tcl's default configuration sets
sizeof(mp_digit) to 8, so that the
dp[] array is made up of 64-bit
integer elements, and also sets
DIGIT_BIT to 28, so that we are
only using 28 out of each 64 bits
to represent the numeric value.
Over 50% waste!
Some parts of the Tcl public interface
accept (mp_int *) arguments. Are we
still free to change these settings without
creating ABI incompatibilities there?
Even if we are not, can we plausibly
argue even in an 8.5.8 release that
what we are really doing is fixing a bug
in the implementation?
After unwinding the header tangle,
the typedef that controls is the one
found on line 2187 of tcl.h.
Changing it means changing the
public interface, at least on those
platforms where TCL_WIDE_INT_IS_LONG
but it still seems to me like the right
choice to make.
Changing line 2187 of tcl.h to
typedef unsigned int mp_digit;
permits the left shifting to continue
several more times. On the 14th
shift, we get the segfault again, but
this time because the actual limits
of the mp_int struct are being exceeded.