Exploring Steganography
The root “steganos” is Greek for “hidden” or “covered”, and the root “graph” is Greek for “to write”. Put these words together, and you’ve got something close to “hidden writing” or “secret writing”.
Steganography is the technique of hiding secret data within an ordinary, non-secret file or message to avoid detection. Examples of steganography include embedding a secret piece of text inside a picture or hiding a secret message or script inside a Word or Excel document.
A ‘trojan horse’ is one form of steganography. The term originated from the wooden horse used by the Greeks during the Trojan War. The Greeks gained secret entry into Troy by pretending the horse was a gift to be taken inside the city gates. However, a group of warriors was hiding inside the structure; they emerged in the evening and opened the gates to the Greek army: malicious elements hiding within a harmless medium. One of the oldest reported cases of steganography goes back to around 500 BC, when the leader of Miletus, Histiaeus, tattooed a message on the shaved head of a slave and let the hair grow back. Histiaeus then sent the slave to his son-in-law Aristagoras, who shaved the slave's head and uncovered the message.
Steganography's goal is to conceal and deceive. It can employ any medium to disguise covert communications, as opposed to cryptography, which involves scrambling data or utilizing a key. Whereas cryptography is primarily concerned with privacy, steganography is a technique that allows for both secrecy and deception.
There are many different methods of steganography. Here are some examples:
Polyglots, in a security context, are files that are valid in more than one file format at the same time. For example, a GIFAR is both a GIF and a JAR (Java archive) file.
Polyglot files are often used to bypass file-specific safeguards. Many applications that allow users to upload files only accept certain types, such as JPEG, GIF, and DOC, to prevent the upload of potentially dangerous files like JS, PHP, or Phar.
A file qualifies as a polyglot when it passes all the validity checks for more than one file type. To an application looking for image files, a polyglot can bear all the signatures of an image file, while to another application looking for different indicators it resembles something else entirely. It checks the boxes of both.
One example of a polyglot file is a Phar-JPEG file. This has to pass all the JPEG validity checks when it's uploaded but still function as a PHP archive when the attacker calls on it to start a PHP object injection attack. Similarly, by abusing the way browsers load content, something like a GIF image file can be created that contains malicious JavaScript code. If the browser is told to treat the file as an image, it will load the image. If it's told that the file is JavaScript, however, the browser will load it as JavaScript and execute the malicious code.
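The dual-validity idea can be sketched in a few lines of Python. This is a minimal illustration, assuming a toy GIF+ZIP polyglot (the same layout a GIFAR relies on, since a JAR is a ZIP archive). The helper names `looks_like_gif`, `looks_like_zip`, and `build_demo_polyglot` are hypothetical, and the GIF part is only a header and trailer rather than a renderable image. The trick works because a GIF is recognised by its leading bytes, while ZIP readers locate the archive from the end of the file.

```python
import io
import zipfile

GIF_MAGIC = (b"GIF87a", b"GIF89a")


def looks_like_gif(data: bytes) -> bool:
    """Naive check: a GIF is recognised by its first six bytes."""
    return data[:6] in GIF_MAGIC


def looks_like_zip(data: bytes) -> bool:
    """ZIP readers search for the end-of-central-directory record near the end."""
    return zipfile.is_zipfile(io.BytesIO(data))


def build_demo_polyglot() -> bytes:
    """Toy polyglot (assumption): GIF header and trailer followed by a ZIP archive."""
    # Signature, 7-byte logical screen descriptor (1x1, no colour table), trailer.
    gif_part = b"GIF89a" + b"\x01\x00\x01\x00\x00\x00\x00" + b"\x3b"
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hidden payload")
    return gif_part + buf.getvalue()


polyglot = build_demo_polyglot()
print(looks_like_gif(polyglot), looks_like_zip(polyglot))
```

Both checks pass on the same bytes, which is why an upload filter that only inspects the leading signature is not enough on its own.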
Least significant bit (LSB) steganography is the process of replacing the least significant bit of the bytes that make up a container file with the bits of the data we want to hide.
Digital images can be described as a finite set of digital values, called pixels. Pixels are the smallest individual element of an image, holding values that represent the brightness of a given color at any specific point. We can think of an image as a matrix (or a two-dimensional array) of pixels that contains a fixed number of rows and columns.
In image steganography, the LSB technique replaces the last bit of each pixel value (typically each color channel byte) with a bit of the secret message.
From the image above, it's clear that changing the most significant bit (MSB) will have a larger impact on the final value. For example, in the byte 11111111 (255), flipping the MSB gives 01111111 (127), while flipping the LSB gives 11111110 (254). Because changing the LSB has a minimal impact on the final value, it is the bit used for steganography.
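The LSB idea described here can be sketched in pure Python. This is a simplified illustration rather than any particular tool's method: the function names `embed_lsb` and `extract_lsb` are hypothetical, the cover is a plain byte string standing in for pixel channel values, and the extractor is assumed to know the message length in advance.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Hide `message` in the least significant bits of `cover`, one bit per byte."""
    # Unpack the message into bits, most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the message bit
    return bytes(stego)


def extract_lsb(stego: bytes, length: int) -> bytes:
    """Recover `length` bytes by reading one LSB per cover byte, MSB first."""
    out = bytearray()
    for start in range(0, length * 8, 8):
        value = 0
        for byte in stego[start:start + 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(200, 232))  # 32 stand-in channel values
stego = embed_lsb(cover, b"hi!")
print(extract_lsb(stego, 3))
print(max(abs(a - b) for a, b in zip(cover, stego)))
```

Each cover byte changes by at most 1, which is why the alteration is hard to see in an image. Real tools usually also store the message length and often encrypt or scatter the bits.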
Steghide is a steganography program capable of hiding data in a variety of image and audio files with a passphrase using the command line. Features include compression and encryption of the embedded data, and automatic integrity checking using a checksum. The JPEG, BMP, WAV, and AU file formats are supported for use as cover files. There are no restrictions on the format of the secret data.
Similarly, ExifTool is a command line tool used to read, write, and manipulate the metadata of a wide range of different file types. Image metadata often contains different tags describing the information about the image, such as date, copyright, and location. ExifTool can be used to write and manipulate tags in image metadata, as well as identify the original type of a file masquerading as another.
Steganalysis is the process of detecting the presence of steganography. By identifying the existence of a hidden message, we can try and find the tools used to hide it. If we then locate the tool, we can use it to extract the original message.
Detection can foil the very purpose of steganography, even if the secret message is not extracted. This is because steganography depends on nobody suspecting that hidden data exists; once its presence is detected, the concealment has failed. Detection is generally carried out by identifying some characteristic feature of images altered by the hidden data.
Secret data may be concealed as simply as hiding text behind other text. In Microsoft Word, text boxes can be placed directly on top of each other and formatted in a way to render the text undetectable by an observer. Images can be hidden behind other images too, using the layers feature of some photo editing tools, such as Photoshop.
One common technique to identify whether data has been hidden is to check the hash of a file and compare it with that of a known original, such as a company logo. If data has been embedded, the file's contents will have changed, which means the hash will no longer match that of the original file.
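A minimal sketch of the hash comparison in Python, using SHA-256 from the standard library. The helper names `sha256_hex` and `differs_from_original` are hypothetical, and in-memory bytes stand in for the two files; in practice the bytes would be read from disk.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def differs_from_original(original: bytes, suspect: bytes) -> bool:
    """True when the suspect file's hash does not match the known original."""
    return sha256_hex(original) != sha256_hex(suspect)


original = bytes(range(256))   # stand-in for the known-good file
suspect = bytearray(original)
suspect[10] ^= 1               # flip a single least significant bit

print(differs_from_original(original, bytes(suspect)))
```

Even a one-bit change produces a different digest. A mismatch only shows that the file is not identical to the original (re-saving or recompressing an image has the same effect), so it is a prompt for further analysis and not proof of hidden data.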
In the case of audio steganography, we can scan very high or otherwise inaudible frequencies for embedded information, and look for distortions or patterns, such as differences in pitch, echo, or background noise, that may reveal a secret message.
Next, I used ExifTool and steghide to extract information from the images.
Using ExifTool, I found the token in the image description: P0Lyn35
I then used steghide to extract information from image2; the password provided was “carnivale”.
Finally, I used the information recovered from image2.jpg together with steghide to access image3.jpg and the final token.
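The steps above can be sketched as a small Python wrapper around the two command line tools. This is an illustration under assumptions, not the exact commands used: the file name `image1.jpg` and the helper names are hypothetical, and the sketch relies on ExifTool's `-ImageDescription` tag option and steghide's `extract -sf <file> -p <passphrase>` form.

```python
import shutil
import subprocess


def exiftool_description_cmd(image: str) -> list[str]:
    """Command that asks ExifTool for the ImageDescription tag only."""
    return ["exiftool", "-ImageDescription", image]


def steghide_extract_cmd(image: str, passphrase: str) -> list[str]:
    """Command that asks steghide to extract embedded data from a stego file."""
    return ["steghide", "extract", "-sf", image, "-p", passphrase]


def run_if_available(cmd: list[str]) -> None:
    """Run the command only when the tool is installed; otherwise just print it."""
    if shutil.which(cmd[0]) is None:
        print("not installed, would run:", " ".join(cmd))
        return
    subprocess.run(cmd, check=False)


if __name__ == "__main__":
    # image1.jpg is a hypothetical name for the first image.
    run_if_available(exiftool_description_cmd("image1.jpg"))
    run_if_available(steghide_extract_cmd("image2.jpg", "carnivale"))
```

The same `steghide_extract_cmd` helper would apply to image3.jpg, with the passphrase recovered from image2.jpg passed in place of “carnivale”.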