2bitBrain: Splitting up Kongsberg Watercolumn files

Thursday, July 5, 2012

Splitting up Kongsberg Watercolumn files

Sometimes you just forget to save Kongsberg multibeam water column data into a separate file. This just happened to me on a cruise, and I found that the >2GB file sizes made my 32-bit software puke. Luckily, the file sizes didn't exceed 4GB, so I decided to write a file splitter in Python that pulls apart the original .all file and outputs a new .all file purged of water column datagrams AND a separate .wcd file. Here's my first cut at it. It's a script that you feed a list of filenames; for each file it creates a "split" subdirectory next to the input and writes a split .all/.wcd pair into that subdirectory. For example, the file:

20120702/0003_20120702_122222_FK_EM710.all

will split into:

20120702/split/0003_20120702_122222_FK_EM710.all
20120702/split/0003_20120702_122222_FK_EM710.wcd
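
The naming convention above can be sketched in isolation. This is only an illustration: `split_names` and `SPLIT_DIR` are hypothetical names that do not appear in the script, and the sketch swaps the extension with `os.path.splitext` rather than the string `replace` the script uses.

```python
import os

SPLIT_DIR = "split"  # same subdirectory name the script uses


def split_names(filename, subdir=SPLIT_DIR):
    """Return (all_path, wcd_path) for an input .all file (hypothetical helper)."""
    outdir = os.path.join(os.path.dirname(filename), subdir)
    base = os.path.basename(filename)
    root, _ext = os.path.splitext(base)
    # The purged .all keeps its original name; the .wcd only swaps the extension.
    return (os.path.join(outdir, base),
            os.path.join(outdir, root + ".wcd"))


all_path, wcd_path = split_names("20120702/0003_20120702_122222_FK_EM710.all")
print(all_path)
print(wcd_path)
```

Swapping only the extension avoids touching a ".all" that happens to appear elsewhere in the path, which a plain string replace would also rewrite.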

Give it a try, I hope it doesn't nuke your data.

#!/usr/bin/env python2.6

import os
import struct
import time
import sys

file_count=0
debug=False

dir="split"

for filename in sys.argv:

    file_count += 1

    if (file_count == 1):
        # I'm too lazy to parse command line args so just skipping over the
        # script name (which is arg zero in the list)
        continue

    file = open(filename, 'rb')
    filesize = os.path.getsize(filename)

    # What is the path to the input file without the filename?
    filepath=os.path.dirname(filename)
    fileprefix=os.path.basename(filename)
    if debug or True:
        print "Doing file",filename
        print "Got file path",filepath
        print "Got file basename",fileprefix

    # Join the file's directory path with the usual output subdirectory name
    outdir=os.path.join(filepath,dir)

    if not os.path.exists(outdir):
        os.makedirs(outdir)

    split_allname = os.path.join(outdir, fileprefix)
    split_wcdname = split_allname.replace(".all",".wcd")

    if debug or True:
        print filename, "will split into", split_allname, split_wcdname

    if not os.path.exists(split_allname):
        split_allfile = open(split_allname,"wb")
        split_wcdfile = open(split_wcdname,"wb")
    else:
        print "Skipping", filename, "since it's already split!"
        file.close()
        continue

    last_percent = 0
    while True:

        # Make sure we don't try to read beyond the EOF
        if (file.tell() + 6 > filesize):
            break

        line = file.read(6)

        # Little-endian: 4-byte length, 1-byte STX, 1-byte datagram type ID
        header = struct.unpack("<IBB", line)

        rawlength=line[0:3]
        length = header[0]
        stx = header[1]
        id = header[2]

        if (stx != 2):
            if debug:
                print 'STX not found, trying next datagram at position',file.tell()-5
            file.seek(-5,1)
            continue

        if debug:
            print 'STX found, going to try for ETX now'

        # Make sure we don't try to read beyond the EOF
        if (file.tell() + (length-5) > filesize):
            file.seek(-5,1)
            continue

        file.seek(length-5,1)

        # Make sure we don't try to read beyond the EOF
        if (file.tell() + 3 > filesize):
            break

        line = file.read(3)

        # Little-endian: 1-byte ETX, 2-byte checksum
        footer = struct.unpack("<BH", line)

        etx = footer[0]
        checksum = footer[1]

        if (etx != 3):
            if debug:
                print 'ETX not found, trying next datagram at position',file.tell()-(length+3)
            file.seek(-(length+3),1)
            continue

        # Rewind to very beginning of the datagram, including the length field
        file.seek(-(length+4),1)

        data = file.read(length+4)

        if debug:
            print "Got id", id, "and length", length

        if (id == 0x49 or id == 0x69 or id == 0x52 or id == 0x55):
            # Stuff for both files
            split_allfile.write(data)
            split_wcdfile.write(data)
        elif (id == 0x6B):
            # Just for the watercolumn file
            split_wcdfile.write(data)
        else:
            # Everything else goes into the raw file
            split_allfile.write(data)

        percent=int(100.0 * file.tell()/filesize)

        if (percent%5 == 0 and percent != last_percent):
            print percent, "% done, ALL:",split_allfile.tell()," WCD:",split_wcdfile.tell()
            last_percent = percent

        if file.tell() >= filesize:
            break

    file.close()
    split_allfile.close()
    split_wcdfile.close()

print 'All done!'
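
For readers who want to see the framing and routing logic separately from the file handling, here is a minimal sketch that works on an in-memory byte string instead of a file. It assumes the layout the script relies on: a 4-byte little-endian length that excludes itself, then STX (0x02), a type ID, the payload, ETX (0x03) and a 2-byte checksum. The names `iter_datagrams`, `destinations` and `make_datagram` are hypothetical, the checksum is not verified (the script does not verify it either), and the synthetic test stream is made up for illustration. As far as I understand the Kongsberg format, 0x49/0x69 are the installation parameter datagrams, 0x52 runtime parameters, 0x55 the sound speed profile and 0x6B water column, but treat that mapping as an assumption.

```python
import struct

STX, ETX = 0x02, 0x03
SHARED_IDS = (0x49, 0x69, 0x52, 0x55)  # the script writes these to both outputs
WATERCOLUMN_ID = 0x6B                  # the script writes this to the .wcd only


def iter_datagrams(buf):
    """Yield (type_id, raw_bytes) for each framed datagram found in buf."""
    pos = 0
    while pos + 6 <= len(buf):
        length, stx, type_id = struct.unpack_from("<IBB", buf, pos)
        end = pos + 4 + length  # the length field does not count its own 4 bytes
        # A frame needs STX, room for STX+ID+ETX+checksum, and must fit in buf.
        ok = stx == STX and length >= 5 and end <= len(buf)
        if ok:
            etx = struct.unpack_from("<BH", buf, end - 3)[0]
            ok = etx == ETX
        if not ok:
            pos += 1  # resync one byte at a time, as the script does
            continue
        yield type_id, buf[pos:end]
        pos = end


def destinations(type_id):
    """Return which outputs a datagram goes to: 'all', 'wcd', or both."""
    if type_id in SHARED_IDS:
        return ("all", "wcd")
    if type_id == WATERCOLUMN_ID:
        return ("wcd",)
    return ("all",)


def make_datagram(type_id, payload=b""):
    """Build a synthetic datagram for testing (checksum left as zero)."""
    body = struct.pack("<BB", STX, type_id) + payload + struct.pack("<BH", ETX, 0)
    return struct.pack("<I", len(body)) + body


# Three synthetic datagrams with a few junk bytes between the first two.
stream = (make_datagram(0x49) + b"\x00junk" +
          make_datagram(0x6B, b"ping") + make_datagram(0x44))
found = [(type_id, destinations(type_id)) for type_id, _ in iter_datagrams(stream)]
print(found)
```

The same iterator can be pointed at the split outputs as a sanity check: a purged .all should yield no 0x6B datagrams at all. For multi-gigabyte files you would want to read in chunks or memory-map the file rather than load it whole.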

7 comments:

Ricardo, September 20, 2012 at 12:52 PM
It happened to us: we stored water column data and that's killing my HD. I tried your script but it gives me a couple of errors. First, I don't have Python 2.6, so I tried with env 2.7. The next set of errors perhaps has to do with the change of version; that is, the interpreter cannot deal with header/footer = struct.unpack(":

header = struct.unpack("
^
SyntaxError: EOL while scanning string literal

Is this a true syntax error, or is it a matter of Python version? Note that I'm not a Python programmer, so perhaps my question is silly.

Thanks for the post!
Ricardo, November 22, 2012 at 2:25 AM
Hi:

I made some changes to the previous script. Now it works as expected. Please, comment.

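#!/usr/bin/env python
# -*- coding: utf-8 -*-

# This python script strips .all EM710 files into wcd (water column data) and .all bottom info.
# Adapted from http://2bitbrain.blogspot.com.es/2012/07/splitting-up-kongsberg-watercolumn.html

import os
import struct
import time
import sys

file_count=0
debug=False

dir="split"

for filename in sys.argv:

    file_count += 1

    if (file_count == 1):
        # I'm too lazy to parse command line args so just skipping over the
        # script name (which is arg zero in the list)
        continue

    file = open(filename, 'rb')
    filesize = os.path.getsize(filename)

    # What is the path to the input file without the filename?
    filepath=os.path.dirname(filename)
    fileprefix=os.path.basename(filename)
    if debug or True:
        print "Doing file",filename
        print "Got file path",filepath
        print "Got file basename",fileprefix

    # Join the file's directory path with the usual output subdirectory name
    outdir=os.path.join(filepath,dir)

    if not os.path.exists(outdir):
        os.makedirs(outdir)

    split_allname = os.path.join(outdir, fileprefix)
    split_wcdname = split_allname.replace(".all",".wcd")

    if debug or True:
        print filename, "will split into", split_allname, split_wcdname

    # if not os.path.exists(split_allname):
    split_allfile = open(split_allname,"wb")
    split_wcdfile = open(split_wcdname,"wb")
    # else:
    #     print "Skipping", filename, "since it's already split!"
    #     file.close()
    #     continue

    last_percent = 0
    while True:

        # Make sure we don't try to read beyond the EOF
        if (file.tell() + 6 > filesize):
            break

        line = file.read(6)

        header = struct.unpack('<IBB', line)

        length = header[0]
        stx = header[1]
        id = header[2]

        if (stx != 2):
            if debug:
                print 'STX not found, trying next datagram at position',file.tell()-5
            file.seek(-5,1)
            continue

        if debug:
            print 'STX found, going to try for ETX now'

        # Make sure we don't try to read beyond the EOF
        if (file.tell() + (length-5) > filesize):
            file.seek(-5,1)
            continue

        file.seek(length-5,1)

        # Make sure we don't try to read beyond the EOF
        if (file.tell() + 3 > filesize):
            break

        line = file.read(3)
        footer = struct.unpack("<BH", line)

        etx = footer[0]
        checksum = footer[1]

        if (etx != 3):
            if debug:
                print 'ETX not found, trying next datagram at position',file.tell()-(length+3)
            file.seek(-(length+3),1)
            continue

        # Rewind to very beginning of the datagram, including the length field
        file.seek(-(length+4),1)

        data = file.read(length+4)

        if debug:
            print "Got id", id, "and length", length

        if (id == 0x49 or id == 0x69 or id == 0x52 or id == 0x55):
            # Stuff for both files
            split_allfile.write(data)
            split_wcdfile.write(data)
        elif (id == 0x6B):
            # Just for the watercolumn file
            split_wcdfile.write(data)
        else:
            # Everything else goes into the raw file
            split_allfile.write(data)

        percent=int(100.0 * file.tell()/filesize)

        if (percent%5 == 0 and percent != last_percent):
            print percent, "% done, ALL:",split_allfile.tell()," WCD:",split_wcdfile.tell()
            last_percent = percent

        if file.tell() >= filesize:
            break

    file.close()
    split_allfile.close()
    split_wcdfile.close()

print 'All done!'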
Christian Ferreira, November 27, 2012 at 11:21 AM
How do I actually run this script? Any hints? I'm not a Python user.
Thanks in advance, and it will be very useful for me. :-)