Customizing File Types-II


R-Studio currently supports two versions of file type descriptions. Version 2 extends the legacy Version 1 by adding variable signature offsets and AND/OR combinations of several signatures within one file type. The version is specified by the version attribute of the FileTypeList section. Version 1 is the default.

File structure

Elements common to Versions 1 and 2 of file type description

File header

The file starts with a standard XML header:

<?xml version="1.0" encoding="utf-8"?>

Section FileTypeList

<FileTypeList>

Attributes:

version

1.0

2.0

Optional

Version of file type description

Default: 1.0

It requires a closing element </FileTypeList>.

Comments

<!-- Comment string -->

An XML-standard string for a comment.
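As a minimal sketch, the common elements described above combine into the following skeleton. The version value 2.0 is chosen only for illustration; omitting the attribute selects Version 1.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Skeleton only: file type descriptions go between the two FileTypeList tags -->
<FileTypeList version="2.0">
</FileTypeList>
```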

Version 1 of file type description

Signature file example

  <FileTypeList>

    <FileType id="2" group="archive" description="ARJ Archive" extension="arj">

      <Signature offset="3" count="1">Abc\x5c\x00\x04</Signature>

      <Signature offset="9" count="1">\x23\x01\xf4</Signature>

    </FileType>

  </FileTypeList>

Section FileType

Describes one file type and its signatures.

Attributes:

id

<u32>

Required

Numeric file type identifier. It must be unique for each file type.


group

<string>

Optional

Specifies the file type group in which found files will appear. You may specify your own group or one of those predefined in the File Types dialog box; see the list of predefined file type groups below.

Default: unknown

description

<string>

Optional

Brief file description

Default: null (no description)

features

NO_SCAN

TXT_ANSI

TXT_UNICODE

Optional

Additional properties of the file type. Several properties should be separated by spaces.

Default: 0

extension

<string>

Optional

File extension.

Default: null (no extension)

File type properties flags

NO_SCAN

The file type is not scanned for: R-Studio will not search for files of this type. Such files are still shown when files are sorted by their extensions.

TXT_ANSI

The file can be viewed as ANSI text. When previewed, the file is sent directly to the Text/hexadecimal editor.

TXT_UNICODE

The file can be viewed as Unicode text. When previewed, the file is sent directly to the Text/hexadecimal editor.
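A sketch of how the features attribute might be written, assuming a hypothetical plain-text file type. The id, description, extension, and signature bytes are invented for illustration, and two flags are given only to show the space-separated form. Whether a NO_SCAN type may omit signatures entirely is not stated here, so a signature is kept.

```xml
<FileTypeList>
  <!-- Hypothetical type: id, description, extension and signature are illustrative only -->
  <FileType id="9001" group="document" description="Example log file" extension="exlog" features="NO_SCAN TXT_ANSI">
    <Signature offset="0">EXLOG</Signature>
  </FileType>
</FileTypeList>
```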

List of predefined file type groups

Group         Name in the File Types dialog box

archive       Archive Files
graphics      Graphics/Picture
internet      Internet-related files
multimedia    Multimedia Files
audio         Multimedia: Audio Files
video         Multimedia: Video Files
font          Font
document      Document
doc_database  Document: Database
doc_sheet     Document: Spreadsheet
exe           Executable/Library/DLL
unknown       Other file types

The FileType section can contain an unlimited number of Signature elements. If there are several Signature elements, all of those signatures must be present in the file simultaneously. Such signatures should have different offset attributes and should not overlap.

Element Signature

The element contains the file signature as a string of ASCII characters and hex bytes in the \xhh format, where hh is a hexadecimal byte code. If \x is not followed by a hexadecimal number, the characters \x are treated as part of the string portion of the signature.
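To make the escaping rule concrete, the sketch below annotates two signature strings. The first string is taken from the Version 1 example; the second is a hypothetical string added to show the fallback for a non-hexadecimal sequence. The id, description, extension, and offsets are placeholders.

```xml
<FileTypeList>
  <FileType id="9003" group="unknown" description="Escape example" extension="esc">
    <!-- "Abc" gives ASCII bytes 41 62 63, followed by hex bytes 5C 00 04: six bytes in total -->
    <Signature offset="3">Abc\x5c\x00\x04</Signature>
    <!-- ZZ is not a hexadecimal number, so the text is read as the four characters \ x Z Z -->
    <Signature offset="12">\xZZ</Signature>
  </FileType>
</FileTypeList>
```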

Attributes:

offset

<u16>

Optional

Decimal offset for the signature

Default: 0

count

<u16>

Optional

Decimal number of signatures of the same length. Used when one of several signatures of the same length can be present at the same offset in a file. In this case, the signatures should be written sequentially in the element, and the size attribute specifies the length of one signature. count*size should be equal to the number of bytes in the element.

If only one signature can be at this offset, count should be "1" and size should be equal to the length of the signature in bytes.

Default: 1

size

<u16>

Optional

Decimal number specifying the number of bytes in the signature.

Default: the number of bytes written in the element.

from

begin

end

Optional

Specifies from where the offset is calculated.

If end, the offset is counted from the end of the file to the first byte of the signature. For example, a two-byte signature at the very end of the file has an offset value of 2.

Default: begin
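A sketch of the count, size, and from attributes under assumed signature bytes. The id, description, extension, and all byte values below are hypothetical.

```xml
<FileTypeList>
  <FileType id="9004" group="unknown" description="Count and size example" extension="cnt">
    <!-- Two 2-byte signatures for offset 0, AB and CD: count*size = 2*2 = 4 bytes in the element -->
    <Signature offset="0" count="2" size="2">ABCD</Signature>
    <!-- A 2-byte signature occupying the last two bytes of the file: with from="end" the offset is 2 -->
    <Signature offset="2" from="end">\xff\xd9</Signature>
  </FileType>
</FileTypeList>
```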

Version 2 of file type description

Signature file example

<?xml version="1.0" encoding="utf-8"?>

<FileTypeList version="2.0">

  <FileType id="5626" group="_Test" description="Test file" extension="tst">

    <Begin combine="and">

      <Signature from="0" to="20">ABC</Signature>

      <Signature offset="1">CDEFG</Signature>

      <AND>

        <Signature offset="0">DE</Signature>

        <Signature offset="0">RTD</Signature>

        <OR>

          <Signature offset="12">CP</Signature>

          <Signature offset="16">RTD</Signature>

        </OR>

      </AND>

    </Begin>

    <End combine="or">

      <Signature from="3" to="20">ABC</Signature>

      <Signature offset="5">CDEFG</Signature>

      <AND>

        <Signature offset="2">DE</Signature>

        <Signature offset="3">RTD</Signature>

        <OR>

          <Signature offset="12">CP</Signature>

          <Signature offset="16">RTD</Signature>

        </OR>

      </AND>

    </End>

  </FileType>

</FileTypeList>

Section FileType

Describes one file type and its signatures.

Attributes:

Similar to those in Version 1.

The section can contain one Begin element and one End element. It must contain at least one of them.

Example

  <FileTypeList version="2.0">

    <FileType id="2" group="archive" description="ARJ Archive" extension="arj">

      <Begin [attributes]>

        ...

      </Begin>

      <End [attributes]>

        ...

      </End>

    </FileType>

  </FileTypeList>

Sections Begin and End

Specify the positions of file type signatures in the file.

Attributes

combine

and

or

Optional

Specifies the logical operation used to combine the elements inside the section: and (intersection) or or (union).

Default: and

These sections can contain one or several Signature elements and one or several OR or AND elements. If there are several elements inside the section, they are combined according to the combine attribute.

Example:

  <FileTypeList version="2.0">

    <FileType id="2" group="archive" description="ARJ Archive" extension="arj">

      <Begin combine="or">

        <Signature [attributes]> ... </Signature>

        ...

        <Signature [attributes]> ... </Signature>

        <AND>

          ...

        </AND>

        <OR>

          ...

        </OR>

      </Begin>

      <End>

        <OR>

          ...

        </OR>

        <Signature [attributes]> ... </Signature>

        ...

        <Signature [attributes]> ... </Signature>

      </End>

    </FileType>

  </FileTypeList>
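As a concrete variant of the skeleton above, the sketch below fills in assumed attributes. The file type, its id, and all signature bytes are hypothetical. How offsets inside End are measured is not described in this section, so those values are placeholders.

```xml
<?xml version="1.0" encoding="utf-8"?>
<FileTypeList version="2.0">
  <FileType id="9005" group="unknown" description="Combine example" extension="cmb">
    <!-- combine="or": either of the two signatures is enough -->
    <Begin combine="or">
      <Signature offset="0">CMB1</Signature>
      <Signature offset="0">CMB2</Signature>
    </Begin>
    <!-- No combine attribute: the default "and" requires both signatures -->
    <End>
      <Signature offset="4">\x00\x00</Signature>
      <Signature offset="2">ND</Signature>
    </End>
  </FileType>
</FileTypeList>
```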

Sections AND and OR

These sections can contain one or several Signature elements and one or several OR or AND elements. If there are several elements inside the section, they are combined according to the section type (logical AND or OR).

Example:

  <FileTypeList version="2.0">

    <FileType id="2" group="archive" description="ARJ Archive" extension="arj">

      <Begin>

        <Signature [attributes]> ... </Signature>

        ...

        <Signature [attributes]> ... </Signature>

        <AND>

          <Signature [attributes]> ... </Signature>

          <OR>

            <Signature [attributes]> ... </Signature>

            <AND>

              <Signature [attributes]> ... </Signature>

              <Signature [attributes]> ... </Signature>

            </AND>

            <OR>

              <Signature [attributes]> ... </Signature>

              <Signature [attributes]> ... </Signature>

            </OR>

          </OR>

          <Signature [attributes]> ... </Signature>

        </AND>

      </Begin>

    </FileType>

  </FileTypeList>
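The nesting can be read as a Boolean expression. The sketch below, with hypothetical signatures labelled A to E, is intended to match when A AND (B OR (C AND D)) AND E holds. The id, extension, offsets, and byte values are invented for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<FileTypeList version="2.0">
  <FileType id="9006" group="unknown" description="Nesting example" extension="nst">
    <Begin combine="and">
      <Signature offset="0">AA</Signature>          <!-- A -->
      <OR>
        <Signature offset="2">BB</Signature>        <!-- B -->
        <AND>
          <Signature offset="2">CC</Signature>      <!-- C -->
          <Signature offset="4">DD</Signature>      <!-- D -->
        </AND>
      </OR>
      <Signature offset="8">EE</Signature>          <!-- E -->
    </Begin>
  </FileType>
</FileTypeList>
```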

Element Signature

The element contains the file signature as a string of ASCII characters and hex bytes in the \xhh format, where hh is a hexadecimal byte code. If \x is not followed by a hexadecimal number, the characters \x are treated as part of the string portion of the signature.

Attributes:

offset

<u16>

Optional

Decimal offset for the signature

Default: 0

from

<u16>

Optional

Decimal number specifying the leftmost possible offset for the file signature.

Ignored if the offset attribute is specified.

Default: undefined

to

<u16>

Optional

Decimal number specifying the rightmost possible offset for the file signature.

Ignored if the offset attribute is specified.

Default: undefined

size

<u16>

Optional

Decimal number specifying the number of bytes in the signature.

Default: the number of bytes written in the element.

Example:

  <FileTypeList version="2.0">

    <FileType id="2" group="archive" description="ARJ Archive" extension="arj">

      <Begin>

        <Signature offset="3">Abc\x5c\x00\x04</Signature>

        <Signature from="9" to="15">\x23\x01\xf4</Signature>

      </Begin>

    </FileType>

  </FileTypeList>
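A sketch of how from and to differ from offset, using hypothetical values throughout. The first signature has a leftmost possible offset of 4 and a rightmost possible offset of 16; the second sets offset, so its from and to are ignored according to the attribute descriptions above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<FileTypeList version="2.0">
  <FileType id="9007" group="unknown" description="Range example" extension="rng">
    <Begin>
      <!-- Looked for between the leftmost offset 4 and the rightmost offset 16 -->
      <Signature from="4" to="16">HDR</Signature>
      <!-- offset is given, so from and to are ignored and the signature is expected at offset 20 -->
      <Signature offset="20" from="0" to="64">\x01\x02</Signature>
    </Begin>
  </FileType>
</FileTypeList>
```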