Thread from comp.lang.tcl (7 replies)

2GB limitation
Posted by alexandru.dadalau@meshparts.de (alexandru) 2 months 1 week ago

Hi,

Will there be a fix for the 2GB size limit that a string representation
has in Tcl?
Maybe already fixed in Tcl 9.0?

Thanks
Alexandru

Re: 2GB limitation
Posted by Andreas Leitgeb <avl@logic.at> 2 months 1 week ago

alexandru <alexandru.dadalau@meshparts.de> wrote:
> Will there be a fix for the 2GB size limit that a string representation
> has in Tcl?
> Maybe already fixed in Tcl 9.0?

Yes, that's one of the reasons for switching to tcl9 as soon as
possible.

Re: 2GB limitation
Posted by alexandru.dadalau@meshparts.de (alexandru) 2 months 1 week ago

Wow, that's unexpected and cool!
Thanks

Re: 2GB limitation
Posted by Rich <rich@example.invalid> 2 months 1 week ago

Andreas Leitgeb <avl@logic.at> wrote:
> alexandru <alexandru.dadalau@meshparts.de> wrote:
>> Will there be a fix for the 2GB size limit that a string representation
>> has in Tcl?
>> Maybe already fixed in Tcl 9.0?
> 
> Yes, that's one of the reasons for switching to tcl9 as soon as
> possible.

What is the new larger "limit" in Tcl9?

Re: 2GB limitation
Posted by Emiliano <emiliano@example.invalid> 2 months 1 week ago

On Mon, 22 Jul 2024 17:17:21 -0000 (UTC)
Rich <rich@example.invalid> wrote:

> Andreas Leitgeb <avl@logic.at> wrote:
> > alexandru <alexandru.dadalau@meshparts.de> wrote:
> >> Will there be a fix for the 2GB size limit that a string representation
> >> has in Tcl?
> >> Maybe already fixed in Tcl 9.0?
> > 
> > Yes, that's one of the reasons for switching to tcl9 as soon as
> > possible.
> 
> What is the new larger "limit" in Tcl9?

In 9.0 the type of the 'length' member of the Tcl_Obj struct (the number of
bytes at '*bytes' member, not including the terminating null) has changed
from int to ptrdiff_t, so it will remain (1<<31)-1 => 2147483647 bytes on
32 bit platforms (unsurprisingly) and (1<<63)-1 => 9223372036854775807
(about 9.22 exabytes) on 64 bit platforms.

IIUC that's also the (new) number of elements for a Tcl list. In practice
the number will be less, since the length of the string representation of
such list will hit the '*bytes' max length first.

-- 
Emiliano

Re: 2GB limitation
Posted by Harald Oehlmann <wortkarg3@yahoo.com> 2 months 1 week ago

On 22.07.2024 at 19:17, Rich wrote:
> Andreas Leitgeb <avl@logic.at> wrote:
>> alexandru <alexandru.dadalau@meshparts.de> wrote:
>>> Will there be a fix for the 2GB size limit that a string representation
>>> has in Tcl?
>>> Maybe already fixed in Tcl 9.0?
>>
>> Yes, that's one of the reasons for switching to tcl9 as soon as
>> possible.
> 
> What is the new larger "limit" in Tcl9?

expr {2**63}

Re: 2GB limitation
Posted by Andreas Leitgeb <avl@logic.at> 2 months 1 week ago

Emiliano <emiliano@example.invalid> wrote:
> In 9.0 the type of the 'length' member of the Tcl_Obj struct (the number of
> bytes at '*bytes' member, not including the terminating null) has changed
> from int to ptrdiff_t, so it will remain (1<<31)-1 => 2147483647 bytes on
> 32 bit platforms (unsurprisingly) and (1<<63)-1 => 9223372036854775807
> (about 9.22 exabytes) on 64 bit platforms.

My hearsay was "generally 64 bit (minus the sign-bit)".
Are you sure that length-type is *always* ptrdiff_t, and
that this may be 32bit?

The "64bit'ness" of a platform is also a bit more complicated...
There are platforms, where pointers are 64bit, but ints are
still 32 (despite machine words being all 64bit) - in those
cases, I'd expect ptrdiff_t to be 64 bit, but on a real old
32bit machine, I don't really know for sure...

> IIUC that's also the (new) number of elements for a Tcl list.
> In practice the number will be less, since the length of the
> string representation of such list will hit the '*bytes' max
> length first.

Not all lists are ever turned into a string rep.  While they are
semantically "just strings", well-written programs can avoid
ever generating the string rep, at least for those
really long lists that may be relevant here.

Re: 2GB limitation
Posted by Emiliano <emiliano@example.invalid> 2 months 1 week ago

On Wed, 24 Jul 2024 16:22:53 -0000 (UTC)
Andreas Leitgeb <avl@logic.at> wrote:

> Emiliano <emiliano@example.invalid> wrote:
> > In 9.0 the type of the 'length' member of the Tcl_Obj struct (the number of
> > bytes at '*bytes' member, not including the terminating null) has changed
> > from int to ptrdiff_t, so it will remain (1<<31)-1 => 2147483647 bytes on
> > 32 bit platforms (unsurprisingly) and (1<<63)-1 => 9223372036854775807
> > (about 9.22 exabytes) on 64 bit platforms.
> 
> My hearsay was "generally 64 bit (minus the sign-bit)".
> Are you sure that length-type is *always* ptrdiff_t, and
> that this may be 32bit?

In 9.X, it is ptrdiff_t. In 8.Y it is still int.

See https://core.tcl-lang.org/tcl/file?ci=trunk&name=generic/tcl.h&ln=325-333
and
https://core.tcl-lang.org/tcl/file?ci=trunk&name=generic/tcl.h&ln=740-752

ptrdiff_t can still be a 32-bit-wide type. See below.
 
> The "64bit'ness" of a platform is also a bit more complicated...
> There are platforms, where pointers are 64bit, but ints are
> still 32 (despite machine words being all 64bit) - in those
> cases, I'd expect ptrdiff_t to be 64 bit, but on a real old
> 32bit machine, I don't really know for sure...

This is what I mean when I say "on 32-bit platforms it is still 2GB",
since the i386-i686 platform has a 32 bit ptrdiff_t.

On my ancient i686 machine:

$ uname -m
i686
$ tclsh9.0 
% expr {(1 << (8 * $tcl_platform(pointerSize))-1) - 1}
2147483647
% package provide Tcl
9.0b3
% set tcl_platform(pointerSize)
4

> > IIUC that's also the (new) number of elements for a Tcl list.
> > In practice the number will be less, since the length of the
> > string representation of such list will hit the '*bytes' max
> > length first.
> 
> Not all lists are ever turned into a string rep.  While they are
> semantically "just strings", well-written programs can avoid
> ever generating the string rep, at least for those
> really long lists that may be relevant here.

Yes, but that's an optimization. Tcl semantics are still defined
in terms of string operations. I prefer not to depend on internals.

-- 
Emiliano
