# BulkExtractor & BulkExtractor-Rec

### Additional plugins

{% embed url="<https://github.com/4n6ist/bulk_extractor-rec>" %}

bulk\_extractor-rec adds scanners that can parse:

* EVTX - Carves Windows Evtx logs
* NTFSINDX - INDX records of $INDEX\_ALLOCATION
* NTFSLOG - RSTR/RCRD records of [NTFS](/hacking-notes/dfir/forensics/windows/microsoft-forensics/ntfs.md#usdlogfile)
* NTFSMFT - FILE records of the [NTFS](/hacking-notes/dfir/forensics/windows/microsoft-forensics/ntfs.md#usdmft) $MFT
* NTFSUSN - USN\_RECORD structures of [NTFS](/hacking-notes/dfir/forensics/windows/microsoft-forensics/ntfs.md#usdextend-usdusnjrnl) and [NTFS](/hacking-notes/dfir/forensics/windows/microsoft-forensics/ntfs.md#usdj)
* UTMP - UTMP structure records (for Unix)

bulk\_extractor is not filesystem-aware (by design), so to narrow the focus to unallocated space or slack space we first need a filesystem-aware tool such as `blkls` from The Sleuth Kit.

blkls

```bash
# Default: extract the unallocated blocks
blkls [options] <image> > image.unallocated
# -s: extract slack space instead
blkls -s <image> > image.slack
```
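The two steps can be chained: extract the unallocated blocks with The Sleuth Kit's block-listing tool (`blkls`), then point `bulk_extractor` at the result. The sketch below is one way to wrap that. `carve_unallocated` and the `DRY_RUN` switch are hypothetical names introduced here for illustration, and the wrapper assumes both tools are on `PATH`.

```bash
# carve_unallocated IMAGE OUTDIR
# Hypothetical wrapper: extracts unallocated blocks, then scans them.
# With DRY_RUN=1 it only prints the commands it would run.
carve_unallocated() {
    image=$1
    outdir=$2
    blocks="${image}.unallocated"

    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "blkls $image > $blocks"
        echo "bulk_extractor -o $outdir $blocks"
        return 0
    fi

    # Stop if blkls fails so bulk_extractor never scans a partial file.
    blkls "$image" > "$blocks" && bulk_extractor -o "$outdir" "$blocks"
}
```

Running `DRY_RUN=1 carve_unallocated forensic_image.dd output` first is a cheap way to check the paths before writing a large `.unallocated` file.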

### Basic Usage

#### Running bulk\_extractor

To run `bulk_extractor` on a disk image or directory, use the following command:

<pre class="language-bash"><code class="lang-bash"><strong>bulk_extractor -o &#x3C;output_directory> &#x3C;input_file_or_directory>
</strong></code></pre>

* `<output_directory>`: Directory where the output will be stored.
* `<input_file_or_directory>`: The disk image file or directory you want to analyze.

### Advanced Options

#### Specifying Scanners

`bulk_extractor` includes various scanners for extracting different types of information. Use `-e` to enable a scanner, `-x` to disable one, or `-E` to run a single scanner exclusively (all other scanners are disabled).

**Enabling Specific Scanners**

```bash
# -E runs only the named scanner; use -e to add a scanner to the defaults
bulk_extractor -o output -E <scanner_name> <input_file>
```

**Disabling Specific Scanners**

```bash
bulk_extractor -o output -x <scanner_name> <input_file>
```

Enable only the `email` scanner:

```bash
bulk_extractor -o output -E email forensic_image.dd
```

Disable the `ccn` (credit card number) scanner:

```bash
bulk_extractor -o output -x ccn forensic_image.dd
```

#### Setting Sector Size

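If you need to set a specific sector size for the input file, pass it through the `-S` option. `-S` takes scanner parameters in `name=value` form rather than a bare number, and the available parameter names vary between builds, so check `bulk_extractor -h` for the one that applies:

```bash
bulk_extractor -o output -S <name>=<sector_size> <input_file>
```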

### Analyzing Results

After running `bulk_extractor`, the output directory will contain several files. The most important ones include:

* **Feature files**: Contain extracted information such as email addresses, URLs, and credit card numbers.
* **Histogram files**: Provide a summary of the occurrences of different features.
* **Report files**: Summarize the findings and provide an overview of the analysis.
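An output directory usually holds many files, and some of them may be empty because the corresponding scanner found nothing. As a quick triage step, the sketch below lists only the non-empty `.txt` files. `nonempty_reports` is a hypothetical helper name, and a file that contains only a comment header will still be listed.

```bash
# nonempty_reports OUTDIR
# Hypothetical helper: list the .txt files in OUTDIR that are larger than 0 bytes.
nonempty_reports() {
    find "$1" -maxdepth 1 -type f -name '*.txt' -size +0 | sort
}

# Example: nonempty_reports output
```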

#### Viewing Feature Files

Feature files are named after the type of data they contain (e.g., `email.txt`, `url.txt`). Open these files with any text editor to review the extracted information.

#### Example

```bash
cat output/email.txt
```
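Feature files are tab-separated text: each data line holds the offset, the feature, and its surrounding context, and header lines start with `#`. Assuming that layout, the sketch below prints each distinct feature once. `list_features` is a hypothetical helper name, not part of `bulk_extractor`.

```bash
# list_features FEATURE_FILE
# Hypothetical helper: print the unique values of column 2 (the feature).
# Lines starting with '#' are skipped; the NF >= 2 guard also drops any
# header line that has no tab-separated fields.
list_features() {
    awk -F '\t' '!/^#/ && NF >= 2 { print $2 }' "$1" | sort -u
}

# Example: list_features output/email.txt
```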

#### Viewing Histograms

Histograms provide a statistical overview of the data. For example, the `email_histogram.txt` file shows the frequency of each extracted email address.

#### Example

```bash
cat output/email_histogram.txt
```
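Histogram lines take the form `n=<count>`, a tab, then the feature. Assuming that layout, the sketch below keeps only features seen at least a given number of times. `frequent_features` is a hypothetical helper name, and the default threshold of 2 is an arbitrary choice.

```bash
# frequent_features HISTOGRAM_FILE [MIN_COUNT]
# Hypothetical helper: print "count<TAB>feature" for every histogram entry
# whose count is at least MIN_COUNT (default 2).
frequent_features() {
    awk -F '\t' -v min="${2:-2}" '
        /^n=/ {
            count = substr($1, 3) + 0   # strip the leading "n="
            if (count >= min) print count "\t" $2
        }' "$1"
}

# Example: frequent_features output/email_histogram.txt 5
```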

{% embed url="<https://github.com/simsong/bulk_extractor>" %}
