#1197 IO_PERFORMANCE on NT 80 times slower than on unix

Milestone: obsolete: 8.0.3
Status: closed-fixed
Priority: 6
Updated: 2001-09-07
Created: 2000-10-26
Creator: Anonymous
Private: No

OriginalBugID: 948 Bug
Version: 8.0.3
SubmitDate: '1998-12-14'
LastModified: '2000-06-22'
Severity: MED
Status: Assigned
Submitter: pat
ChangedBy: hobbs
OS: Windows NT
Machine: X86
FixedDate: '2000-10-25'
ClosedDate: '2000-10-25'

Name: Uwe Traum

ReproducibleScript:

proc dotest {{filename test.bin}} {
    set fid [open $filename w]
    fconfigure $fid -translation binary

    for { set i 0 } { $i < 2000 } { incr i } {
        set ind [expr {128*int(rand()*30000)}]
        #seek $fid $ind start
        puts -nonewline $fid "123456789012345678901234567890"
    }
    close $fid
}
time dotest 3

[rewritten by hobbs as proc]

ObservedBehavior:
Output:

NT4;local disk;PentiumPro 200: 155172000 microseconds per iteration
Solaris2.5;local disk;sparc20: 1844860 microseconds per iteration

on unix it's 80 times faster than on NT!!!

DesiredBehavior:
same speed

In FileOutputProc (tcl8.0.3/win/tclWinChan.c, line 560) there
is ALWAYS a call to FlushFileBuffers.
So every write goes directly to disk.
That's why the disk LED is permanently blinking.

What's the reason for this call?
Can it be removed?

thanks
--
This is verified in 8.4a1. The disk LED does stay permanently
on under NT. Using the Performance Monitor, it does seem that
excessive flushing may be occurring.

-- 06/22/2000 hobbs

Discussion

  • Andreas Kupries

    Andreas Kupries - 2000-11-17
    • priority: 5 --> 6
     
  • David Gravereaux

    The file channel driver on Win* forces a flush. It really doesn't need to, but some file tests depend on it doing a true write to disk. As a result, it's slower.

     
  • Donal K. Fellows

    • labels: 104247 --> 27. Channel Types
     
  • Donal K. Fellows

    See also Bug #119300. We have so many unclosed bugs that it is impractical to link related ones... <sigh>

     
  • Andreas Kupries

    Andreas Kupries - 2001-08-23
    • assigned_to: nobody --> andreas_kupries
     
  • Andreas Kupries

    Andreas Kupries - 2001-08-23

    Logged In: YES
    user_id=75003

    The actual id is #219300 after SF did its renumbering dance.

     
  • Andreas Kupries

    Andreas Kupries - 2001-08-23

    Logged In: YES
    user_id=75003

    This is the list of tests which fail if flushing is
    disabled in the windows file driver:

    io-27.2 FlushChannel, some output buffered
    io-27.4 FlushChannel, implicit flush when buffer fills
    io-27.5 FlushChannel, implicit flush when buffer fills and
    on close
    io-29.4 Tcl_WriteChars, buffering in full buffering mode
    io-29.5 Tcl_WriteChars, buffering in line buffering mode
    io-29.6 Tcl_WriteChars, buffering in no buffering mode
    io-29.7 Tcl_Flush, full buffering
    io-29.8 Tcl_Flush, full buffering
    io-29.17 Tcl_WriteChars buffers, then Tcl_Flush flushes
    io-29.18 Tcl_WriteChars and Tcl_Flush intermixed
    io-29.19 Explicit and implicit flushes
    io-29.20 Implicit flush when buffer is full
    io-29.28 Tcl_WriteChars, lf mode
    io-39.6 Tcl_SetChannelOption, multiple options
    io-39.7 Tcl_SetChannelOption, buffering, translation
    io-39.8 Tcl_SetChannelOption, different buffering options
    io-52.7 TclCopyChannel

     
  • Andreas Kupries

    Andreas Kupries - 2001-08-23

    Logged In: YES
    user_id=75003

    Ok, I now understand the problem much better. It is
    partially an OS issue and partially an issue of how the
    affected tests were written.

    When Tcl 'flushes' a channel it actually only writes its
    internal buffers to the OS and then forgets about the data.
    The OS is free to delay the actual write to disk.

    The affected tests try to check that the flushing behaviour
    of Tcl is correct. To do so they perform some writes and
    then check the size of the resulting file. But this
    means that they actually check two things at once: the
    flushing behaviour of Tcl itself, and how the OS deals with
    pending data when it comes to reporting the size of a file.

    Both Unix and Win* platforms delay writing data to disk
    until they have idle time, or group nearby blocks
    together, etc. But evidently Win* is lazier than Unix
    when it comes to reporting the size of a file with pending
    writes. Win* reports the size actually on disk, no matter
    how much data is pending. Unix goes to the trouble of
    calculating the size of the file as if the pending data had
    already been written to the disk.

    The current solution of this problem is to force Win* to
    actually write all the data written to it by Tcl to the
    disk too, without delay. This gets us the reliable file
    sizes the tests need to perform correctly, at the expense
    of general I/O performance.
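    For anyone who wants to see the pattern the tests rely on, here is a
    minimal sketch in C. It is not code from the Tcl test suite: the stdio
    buffer merely stands in for Tcl's channel buffer, and the function
    names and scratch file are made up for illustration. It writes a few
    bytes in full buffering mode and asks the OS for the file size before
    and after the explicit flush.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Size of `path` as reported by the OS, or -1 on error. */
static long reported_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1L;
}

/*
 * Hypothetical helper: write `len` bytes into a fully buffered stream
 * and record the size the OS reports before and after the flush.
 * Returns 0 on success, -1 on error.
 */
static int size_around_flush(const char *path, size_t len,
                             long *before, long *after)
{
    static char buf[8192];   /* user-space buffer, playing the role of
                                Tcl's internal channel buffer */
    FILE *f;
    size_t i;

    assert(before != NULL && after != NULL);
    if (len >= sizeof buf)
        return -1;           /* keep everything inside one buffer */
    f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    setvbuf(f, buf, _IOFBF, sizeof buf);   /* full buffering */

    for (i = 0; i < len; i++)
        fputc('x', f);
    *before = reported_size(path);   /* data is still inside the process */

    fflush(f);                       /* hand the bytes to the OS */
    *after = reported_size(path);    /* what a size-based test looks at */

    fclose(f);
    remove(path);
    return 0;
}
```

    On a typical Unix system the size is 0 before the flush and equal to
    the number of bytes written afterwards, even though nothing may have
    reached the disk yet. The second number is the one that, per the
    analysis above, Win* did not report reliably without the forced
    OS-level flush.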

     
  • Nobody/Anonymous

    Logged In: NO

    I agree 100% with your summary.

     
  • Andreas Kupries

    Andreas Kupries - 2001-08-24

    Logged In: YES
    user_id=75003

    Just for the record here are the results of running tclbench
    for a tclsh with forced flushing (1) and without (2) for my
    machine (Win NT 5, 128 MB). Used fcopy to exercise the I/O
    system.

    $ ./tcl/win/win-dll/tclsh84.exe tclbench/runbench.tcl \
        -match 'FCOPY*' -notk \
        -paths "./tcl/win/win-dll/ ./tcl.nf/win/win-dll/"

    000 VERSIONS:               1:8.4a4  2:8.4a4
    001 FCOPY binary: 164K      2320137    19575
    002 FCOPY encoding: 164K    1583793    39857
    003 FCOPY std: 164K         2435353    18588
    003 BENCHMARKS              1:8.4a4  2:8.4a4
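    To put the two columns in proportion, a small C sketch that divides
    the forced-flush timing by the no-flush timing for each row. The
    numbers are copied from the table above; the function names are made
    up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* How many times larger `with_flush` is than `without_flush`. */
static double slowdown(double with_flush, double without_flush)
{
    assert(without_flush > 0.0);
    return with_flush / without_flush;
}

/* Print the ratio for each FCOPY row of the benchmark table. */
static void print_fcopy_ratios(void)
{
    static const struct {
        const char *name;
        double forced;      /* column 1: forced OS flush */
        double unforced;    /* column 2: no forced flush */
    } rows[] = {
        { "FCOPY binary",   2320137.0, 19575.0 },
        { "FCOPY encoding", 1583793.0, 39857.0 },
        { "FCOPY std",      2435353.0, 18588.0 },
    };
    size_t i;

    for (i = 0; i < sizeof rows / sizeof rows[0]; i++)
        printf("%-15s %6.1fx\n", rows[i].name,
               slowdown(rows[i].forced, rows[i].unforced));
}
```

    The ratios come out at roughly 119, 40 and 131 for the binary,
    encoding and std runs, which is in the same ballpark as the factor
    of about 80 in the original report.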

     
  • Andreas Kupries

    Andreas Kupries - 2001-08-24

    Logged In: YES
    user_id=75003

    Ideas to solve this problem collected so far.
    ________________________________________
    Just remove the forced OS flush for
    Windows. Make the tests 'unixOnly'.

    Anticipated Effects:

    - Speedup for Windows I/O compared to
    current solution.

    - No change for the other platforms.

    - The coverage of code paths by the
    testsuite decreases. In other words,
    the testsuite becomes worse.
    ________________________________________
    Add counters in the channel structures
    (on the driver side) to count how many
    bytes were read and written to the OS.

    Add testchannel subcommands to access this
    information instead of using [file size].

    The tests will have to be rewritten.

    Anticipated Effects:

    - General slowdown in the I/O system
    for all platforms (Counter management).
    Should be negligible though.

    - Speedup for Windows I/O compared to
    current solution.

    - The testsuite stays in shape.
    ________________________________________
    Handle the proposed counters only for Win*.
    Write separate tests for Unix and Win*

    Anticipated Effects:

    - Speedup for Windows.

    - No change for the other platforms.

    - The testsuite stays in shape.
    ________________________________________
    Add a boolean flag to the Win* structures
    (driver side). Indicates if a true flush
    was done on the file channel.

    Whenever a [file size] is requested the
    system goes through the list of file
    channels and does an OS flush on all with
    the flag not set. The flag is set by this
    action. Any write on the channel resets
    the flag for that channel. When closing a
    file channel do a true flush in the driver.

    The testsuite needs no change.

    Anticipated Effects:

    - Slowdown of [file size] operation
    for Win*.

    - Speedup of Win* I/O in general.

    - No change for the other platforms.

    - Essentially emulates Unix behaviour
    on Windows for Tcl.

    - Adds interaction between the
    filesystem and the I/O (channel)
    code.

    - The testsuite stays in shape.
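    A rough C sketch of the byte-counter idea from the second and third
    proposals above. This is an illustration only: DemoChannel,
    demo_output and demo_bytes_written are hypothetical names, not Tcl's
    real driver structures or an existing testchannel subcommand, and
    fwrite() merely stands in for the driver's real OS write call.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical driver-side state; NOT Tcl's real file channel structure. */
typedef struct DemoChannel {
    FILE *fp;                  /* stand-in for the OS file handle */
    unsigned long bytesToOS;   /* total bytes handed over so far */
} DemoChannel;

/*
 * Stand-in for a driver output procedure: pass the data on, then count
 * what was accepted. Returns 0 on success, -1 on a short write.
 */
static int demo_output(DemoChannel *ch, const char *buf, size_t toWrite)
{
    size_t written;

    assert(ch != NULL && ch->fp != NULL);
    written = fwrite(buf, 1, toWrite, ch->fp);
    ch->bytesToOS += (unsigned long)written;   /* the only extra cost */
    return written == toWrite ? 0 : -1;
}

/* What a test would query instead of calling [file size]. */
static unsigned long demo_bytes_written(const DemoChannel *ch)
{
    assert(ch != NULL);
    return ch->bytesToOS;
}
```

    A test written against such a counter checks how much Tcl handed to
    the OS and no longer depends on how the OS reports the size of a file
    with pending writes.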

     
  • Andreas Kupries

    Andreas Kupries - 2001-08-24

    Logged In: YES
    user_id=75003

    More ideas (coming from Jeff).
    ________________________________________
    What happens on Windows if another process
    opens the file ? Does that process also
    get the bogus file size ?
    ________________________________________
    Are there Win* APIs we could use to peek
    into the buffering done by Windows ?
    We could use this instead of the counters.

    Or we could use this in [file size] to
    report a better size.

     
  • Andreas Kupries

    Andreas Kupries - 2001-09-05

    Logged In: YES
    user_id=75003

    Ideas from David Gravereaux:

    The only thing I know is that if the OS is holding
    uncommitted buffers, a request for file size won't cause
    the OS to commit the buffers first.

    A look at using I/O completion ports for writing to disk
    from within Tcl >might< be a good work-around for tracking
    what the OS hasn't committed yet. I can't say for sure. The
    amount of code for tracking could get very large. Adding an
    explicit flush to the channel driver might be the best
    alternative, but explicit at the script level to the user
    instead of the implicit one as is now.

    That's all I know.

    >Hm. We have a flushproc in the driver, it is just not used
    >yet. This could contain the OS-Flush on windows and be
    >called by [flush] after it has committed the tcl buffers to
    >the OS. This does not help with the tests which check file
    >sizes to check the correctness of the 'implicit' flushes.
    >And the moment we add the OS-flush to them we are back to
    >the current situation.

    half way there... add a [flush] to the tests, that will do
    FlushFileBuffers() or whatever was the API func...

    It's not the same. Make [flush] not only flush the channel
    but commit the OS buffers, too. Normal mode flushing of the
    channel buffer doesn't have to also mean flushing the OS
    buffers, too.
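    The distinction drawn here, between handing buffered data to the OS
    and asking the OS to commit it to disk, can be sketched in C on a
    POSIX system. This is an analogy, not the Windows driver code:
    fsync() plays the role that FlushFileBuffers() plays on Win*, the
    stdio buffer stands in for the channel buffer, and both function
    names are made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Normal flush: move buffered data from the process to the OS only. */
static int flush_only(FILE *f)
{
    assert(f != NULL);
    return fflush(f) == 0 ? 0 : -1;
}

/*
 * Explicit flush as proposed for [flush]: first empty the user-space
 * buffer, then ask the OS to commit its own buffers to disk.
 */
static int flush_and_commit(FILE *f)
{
    assert(f != NULL);
    if (fflush(f) != 0)
        return -1;
    return fsync(fileno(f)) == 0 ? 0 : -1;
}
```

    Ordinary writes would go through the cheap path, and only a caller
    that really needs the data on disk would pay for the commit.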

     
  • Andreas Kupries

    Andreas Kupries - 2001-09-06

    Logged In: YES
    user_id=75003

    Added a patch solving the problem. Used the idea of a
    boolean flag and flushing only the channels which were
    written to and only when requesting size information.
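    The attached diff is not reproduced here. As a rough illustration of
    the flag idea only, here is a self-contained C sketch with made-up
    names (DemoFile, demo_write and so on). It inverts the flag into a
    "dirty" bit and counts OS-level flushes instead of calling
    FlushFileBuffers(), so it says nothing about how the real patch is
    structured.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FILES 4

/* Hypothetical per-file driver state, for illustration only. */
typedef struct DemoFile {
    int open;        /* slot is in use */
    int dirty;       /* written to since the last OS-level flush */
    int osFlushes;   /* how often the expensive flush ran */
} DemoFile;

static DemoFile demoFiles[DEMO_MAX_FILES];

/* Stand-in for the expensive call (FlushFileBuffers() on Win*). */
static void demo_os_flush(DemoFile *f)
{
    f->osFlushes++;
    f->dirty = 0;
}

static DemoFile *demo_open(size_t slot)
{
    assert(slot < DEMO_MAX_FILES);
    demoFiles[slot].open = 1;
    demoFiles[slot].dirty = 0;
    demoFiles[slot].osFlushes = 0;
    return &demoFiles[slot];
}

/* A write only marks the file; no OS-level flush happens here. */
static void demo_write(DemoFile *f)
{
    assert(f != NULL && f->open);
    f->dirty = 1;
}

/* Called when size information is requested: flush dirty files only. */
static void demo_size_requested(void)
{
    size_t i;

    for (i = 0; i < DEMO_MAX_FILES; i++)
        if (demoFiles[i].open && demoFiles[i].dirty)
            demo_os_flush(&demoFiles[i]);
}

/* Closing commits whatever is still pending. */
static void demo_close(DemoFile *f)
{
    assert(f != NULL && f->open);
    if (f->dirty)
        demo_os_flush(f);
    f->open = 0;
}
```

    With this scheme the 2000 writes of the original script cost no
    OS-level flush at all until someone asks for the size or closes the
    file, at which point a single flush is enough.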

     
  • Andreas Kupries

    Andreas Kupries - 2001-09-06

    Fix, unified diff, v0

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-09-07

    Logged In: YES
    user_id=72656

    Looks great.

     
  • Andreas Kupries

    Andreas Kupries - 2001-09-07
    • status: open --> closed-fixed
     
  • Andreas Kupries

    Andreas Kupries - 2001-09-07

    Logged In: YES
    user_id=75003

    Committed to both head and core-8-3-1-branch.