Glen Turner (vk5tu) wrote,

Unix holey files

Unix has sparse files. If you seek() past the end of a file and write() a byte there, then all of the unwritten bytes before that byte read back as zero. Those zeroed bytes take no storage space on the disk (although the accounting for the storage does take some space). You can think of the file as having a "hole".
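
A small sketch of that behaviour, assuming a POSIX-like shell with dd(1), tr(1) and mktemp(1) available. The temporary file and the 4096-byte offset are arbitrary choices for illustration. It writes a single byte past the end of an empty file and then checks that everything before it reads back as zero:

```shell
# Hypothetical scratch file; any writable path would do.
f=$(mktemp)

# Seek to offset 4096 and write the single byte 'X'.
# Bytes 0..4095 are never written, so they form the hole.
printf 'X' | dd of="$f" bs=1 seek=4096 2>/dev/null

# Logical length of the file: 4096 unwritten bytes plus the one written byte.
size=$(( $(wc -c < "$f") ))

# Read the first 4096 bytes, delete every zero byte, and count what is left.
nonzero=$(( $(dd if="$f" bs=4096 count=1 2>/dev/null | tr -d '\0' | wc -c) ))

echo "size=$size bytes, non-zero bytes before the written byte: $nonzero"
rm -f "$f"
```

Whether the unwritten region is actually stored as a hole depends on the filesystem; the zero-on-read behaviour holds either way.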

Sparse files are useful for network testing, as they allow the performance of the storage and I/O hardware to be taken out of the test, leaving the performance of the operating system and the network.

Sparse files for testing are conveniently created using dd(1). For example, to create a 10GiB test file named ‘test-10gibyte.bin’:

$ dd if=/dev/zero of=test-10gibyte.bin bs=1 count=1 seek=$(( (10 * 1024 * 1024 * 1024) - 1))

and to create a 10GB file named ‘test-10gbyte.bin’:

$ dd if=/dev/zero of=test-10gbyte.bin bs=1 count=1 seek=$(( (10 * 1000 * 1000 * 1000) - 1))
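
One way to check that such a file really is sparse is to compare its apparent length with the space allocated to it. The sketch below uses the same dd(1) technique, scaled down to 64 MiB and written to a temporary file; both of those choices are assumptions for illustration, not part of the commands above.

```shell
# Hypothetical scratch file, 64 MiB instead of 10 GiB to keep the check quick.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1 count=1 seek=$(( (64 * 1024 * 1024) - 1 )) 2>/dev/null

# Apparent size: the length a reader of the file would see, in bytes.
apparent=$(( $(wc -c < "$f") ))

# Allocated size: the space du(1) reports as actually used, in KiB.
allocated_kib=$(du -k "$f" | cut -f1)

echo "apparent: $apparent bytes"
echo "allocated: $allocated_kib KiB"
rm -f "$f"
```

On a filesystem that supports holes the allocated figure should be a few KiB at most, far below the 65536 KiB apparent size. On a filesystem without sparse file support the two figures will be roughly equal.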

Aside: Units for networking and units for RAM

Networking uses SI units for bandwidth, due to the close relationship of bandwidth with signalling frequencies, measured in SI's hertz. The error between (10³)ⁿ and (2¹⁰)ⁿ increases with n; becoming concerning when n=3 (GB versus GiB, about 7%); and being unsustainably large when n≥4 (TB versus TiB, about 10%, and growing from there).
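
The growth of that error can be tabulated with shell integer arithmetic. This is a rough sketch: the function name prefix_error_bp is made up for illustration, and it reports the error relative to the SI value in hundredths of a percent, truncated rather than rounded. It assumes a shell with 64-bit arithmetic, such as bash.

```shell
# Relative error of (2^10)^n against (10^3)^n, in hundredths of a percent.
prefix_error_bp() {
    n=$1
    si=1
    bin=1
    i=0
    while [ "$i" -lt "$n" ]; do
        si=$(( si * 1000 ))
        bin=$(( bin * 1024 ))
        i=$(( i + 1 ))
    done
    echo $(( (bin - si) * 10000 / si ))
}

# n=1 is kB versus KiB, n=2 is MB versus MiB, and so on.
for n in 1 2 3 4; do
    bp=$(prefix_error_bp "$n")
    printf 'n=%d: %d.%02d%%\n' "$n" $(( bp / 100 )) $(( bp % 100 ))
done
```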

Networking also uses bits as the basic unit rather than bytes, again due to the closer relationship of bits to signalling frequencies. In networking there are 8 bits per byte. Care is taken to distinguish Gbps (gigabits per second) and GBps (gigabytes per second) due to the eight-fold difference. Incorrect casing of the ‘b’ leads to exasperated coworkers.
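
To illustrate the eight-fold difference, here is a back-of-the-envelope calculation of the best-case time to move the 10GB test file. The 1 Gbps link rate is an assumption chosen for the example, and protocol overhead is ignored entirely.

```shell
file_bytes=$(( 10 * 1000 * 1000 * 1000 ))   # 10 GB, SI units
link_rate=$(( 1000 * 1000 * 1000 ))         # assumed link rate of 1 G per second

# Correct reading: the link carries 1 Gbps, so convert the file to bits first.
seconds=$(( file_bytes * 8 / link_rate ))

# Mistaken reading: treating the same figure as 1 GBps.
seconds_if_misread=$(( file_bytes / link_rate ))

echo "at 1 Gbps: $seconds s"
echo "if misread as 1 GBps: $seconds_if_misread s"
```

The correct figure is 80 seconds; misreading the unit suggests 10 seconds.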

Tags: linux