NABEEL.

THE ART OF ARK

A Comprehensive Masterclass in Semantic Web Extraction (v3.1)


0. THE GENESIS: Why ARK?

Most browser tools see a webpage as a wall of code. They rely on "Divs," "Spans," and "Classes" that change every week. ARK (Automated Result Kernel) is different. It is a Semantic Engine.

ARK doesn't look for code; it looks for concepts. It knows what an email looks like, what a price looks like, and what a "search result" container looks like. This guide will teach you how to talk to ARK.


CHAPTER 1: The Magic Command (/>)

Magic Mode is the heart of ArkFind. It uses a custom-built heuristic engine to identify data without needing CSS selectors.

1.1 The Semantic Keywords

ARK comes pre-programmed with high-precision regex patterns for everyday data:

  • email / emails: Matches standard email formats.
  • phone / phones: Matches US and International phone numbers.
  • price / prices: Matches currency symbols ($ € £ ₹) followed by numbers.
  • date / dates: Matches YYYY-MM-DD, DD/MM/YYYY, and US formats.
  • time / times: Matches 12h and 24h clock formats.
  • color / colors: Matches HEX, RGB, RGBA, HSL, and HSLA codes.
  • number / numbers: Matches integers and floats.
  • address / addresses: Matches common street address patterns.
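
One way to picture the keyword list above is as a table of named regular expressions. The Python sketch below is illustrative only: the name KEYWORD_PATTERNS, the function find_keyword, and the patterns themselves are assumptions, deliberately simpler than anything a production engine would use, and they are not ARK's actual internals.

```python
import re

# Hypothetical, simplified patterns; ARK's real patterns are not shown in this guide.
KEYWORD_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "price": re.compile(r"[$€£₹]\s?\d[\d,]*(?:\.\d+)?"),
    "date": re.compile(r"\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{4}"),
    "color": re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b"),
}

def find_keyword(keyword: str, text: str) -> list[str]:
    """Return every substring of text that matches the named keyword pattern."""
    return KEYWORD_PATTERNS[keyword].findall(text)

sample = "Contact sales@example.com, price $1,299.99, released 2024-03-15, accent #ff8800."
emails = find_keyword("email", sample)
prices = find_keyword("price", sample)
```

Only four of the keywords are sketched here; phone numbers and street addresses need considerably more elaborate patterns.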

1.2 Singular vs. Plural Logic

ARK interprets grammar to define the result set size:

  • Singular (email): Finds only the first match inside each container.
  • Plural (emails): Finds every match inside each container (List Mode).
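
The singular/plural rule above amounts to "first match" versus "all matches" per container. A minimal Python sketch, assuming a simplified email pattern and a hypothetical extract helper (neither is taken from ARK itself):

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")  # simplified, illustrative pattern

def extract(keyword: str, container_text: str):
    """Singular keyword -> first match or None; plural keyword -> list of all matches."""
    matches = EMAIL.findall(container_text)
    if keyword == "emails":   # plural: List Mode
        return matches
    if keyword == "email":    # singular: first match only
        return matches[0] if matches else None
    raise ValueError(f"unsupported keyword: {keyword}")

# Each string stands in for the text of one container on the page.
containers = [
    "Team A: ann@example.com, bob@example.com",
    "Team B: cy@example.org",
]
singular = [extract("email", c) for c in containers]
plural = [extract("emails", c) for c in containers]
```

Note that both forms still produce one result per container; only the shape of each result changes.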

CHAPTER 2: Structural Scoping (The Row Logic)

The true power of ARK is its ability to turn a messy list into a clean spreadsheet. We do this using the Colon Operator (:).

Syntax: /> [CONTAINER]: [COLUMN1] [COLUMN2]

Example: Scraping Amazon or eBay

Suppose you are on a search page. Every result is inside a box with the class .s-item.

/> .s-item: title price link

What happens under the hood?

  1. Scope: ARK finds every box matching .s-item.
  2. Mapping: Inside each box, it looks for the title text, the price regex, and the first link.
  3. Result: You get a clean table where every row is one product.
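
The three steps above can be sketched in Python. This is a simplified model, not ARK's implementation: parse_query is a hypothetical helper, the page is represented by plain dictionaries instead of a DOM, and the sketch ignores the template part of a query and selectors that themselves contain a colon.

```python
def parse_query(query: str) -> tuple[str, list[str]]:
    """Split '/> CONTAINER: COL1 COL2' into the container selector and column names."""
    body = query.strip()
    if not body.startswith("/>") or body.startswith("/>>"):
        raise ValueError("expected a Magic Mode query starting with />")
    container, _, columns = body[2:].partition(":")
    return container.strip(), columns.split()

# Stand-in for the page: each dict plays the role of one .s-item box.
boxes = [
    {"title": "USB-C Cable", "price": "$9.99", "link": "https://example.com/1"},
    {"title": "Phone Stand", "price": "$14.50", "link": "https://example.com/2"},
]

container, columns = parse_query("/> .s-item: title price link")

# One row per box, one cell per requested column.
rows = [[box.get(col) for col in columns] for box in boxes]
```

Each entry of rows corresponds to one product, which is the "clean table" described in step 3.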

CHAPTER 3: Advanced Selector Grammar

3.1 Attribute Extraction (@)

If data is hidden inside HTML attributes (like an image URL or a button's value), use @.

/> .profile: img@src a@href input@value

3.2 List Mode (*)

You can force List Mode on any custom selector by adding *.

/> .article: .tags*

(This gets all elements matching .tags inside the article.)

3.3 Text Filtering ("")

Only pick elements that contain a specific text string.

/> button"Add to Cart"

3.4 Fallback Logic (||)

Many modern sites use A/B testing, so the same data can sit under different class names. One page might use .price-tag, another might use .cost. Use || to list the alternatives.

/> .item: .price-tag||.cost
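
The fallback idea can be sketched as "take the first alternative that yields something". The Python below assumes alternatives are tried left to right, which this guide does not state explicitly; first_available is a hypothetical helper, and dictionaries stand in for page elements.

```python
from typing import Optional

def first_available(element: dict, selectors: list[str]) -> Optional[str]:
    """Return the value for the first selector that yields a non-empty result."""
    for selector in selectors:
        value = element.get(selector)
        if value:
            return value
    return None

# Two variants of the same page produced by A/B testing (illustrative data).
item_a = {".price-tag": "$10.00"}
item_b = {".cost": "$12.00"}

selectors = ".price-tag||.cost".split("||")
price_a = first_available(item_a, selectors)
price_b = first_available(item_b, selectors)
```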

3.5 Requirement Enforcement (!)

Skip incomplete data. Append ! to a column to mark it as required: if a row doesn't have a price, the whole row is thrown away.

/> .result: title price!
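
The modifiers introduced in this chapter (@, *, "", || and !) can be summarized as a small parser for a single column token. The Python sketch below is an assumption about how such tokens could be decomposed, not ARK's parser: the Column class, parse_column, and the assumed ordering (selector, then "text", then @attribute, then trailing * and ! flags) are all hypothetical, and selectors or filters that themselves contain @ or quotes are not handled.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Column:
    selectors: list[str] = field(default_factory=list)  # alternatives split on ||
    attribute: Optional[str] = None    # from @attr
    text_filter: Optional[str] = None  # from "text"
    is_list: bool = False              # from *
    required: bool = False             # from !

def parse_column(token: str) -> Column:
    col = Column()
    # Trailing flags: * (List Mode) and ! (required), assumed to come last in any order.
    while token and token[-1] in "*!":
        if token[-1] == "*":
            col.is_list = True
        else:
            col.required = True
        token = token[:-1]
    # Optional @attribute, assumed to follow the selector.
    if "@" in token:
        token, col.attribute = token.rsplit("@", 1)
    # Optional "text" filter, assumed to follow the selector directly.
    match = re.fullmatch(r'(.*?)"([^"]*)"', token)
    if match:
        token, col.text_filter = match.group(1), match.group(2)
    # Whatever remains is one or more selectors separated by ||.
    col.selectors = token.split("||")
    return col

examples = [parse_column(t) for t in
            ["img@src", ".tags*", 'button"Add to Cart"', ".price-tag||.cost", "price!"]]
```

A row filter for ! then reduces to dropping any row in which a column with required set to True came back empty.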


CHAPTER 4: Strict Mode (/>>)

Sometimes ARK's "Magic" heuristics get in the way. Use Strict Mode to disable all guessing.

Syntax: />> [SCOPE] >> ([COL1_SELECTOR]) ([COL2_SELECTOR])

Example: />> article.post >> (h2 > span.title) (div.content a@href)
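
To make the strict syntax concrete, the Python sketch below splits a strict query into its scope and its parenthesized column selectors. parse_strict is a hypothetical helper, and the sketch assumes selectors contain no parentheses of their own (so something like :nth-child(2) would need a real parser).

```python
import re

def parse_strict(query: str) -> tuple[str, list[str]]:
    """Split '/>> SCOPE >> (COL1) (COL2)' into the scope and the raw column selectors."""
    body = query.strip()
    if not body.startswith("/>>"):
        raise ValueError("strict queries must start with the />> trigger")
    scope, sep, columns = body[3:].partition(">>")
    if not sep:
        raise ValueError("missing >> between scope and columns")
    # Each column is whatever sits between one pair of parentheses (no nesting assumed).
    return scope.strip(), re.findall(r"\(([^()]*)\)", columns)

scope, cols = parse_strict("/>> article.post >> (h2 > span.title) (div.content a@href)")
```

Because every column is wrapped in parentheses, selectors may contain spaces and the > child combinator without ambiguity.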


CHAPTER 5: The Template Engine (|)

Extraction is only half the battle. The Template Engine (the Pipe |) allows you to reshape your data into Markdown, CSV, or custom strings.

5.1 Variables

Use $1, $2, $3 to refer to the data columns you extracted.

5.2 List Slicing

If a column contains a list (e.g., tags), slice it using []:

  • $1[0]: The first item only.
  • $1[-1]: The last item only.
  • $1[0:3]: The first three items.
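
The three slice forms above behave like Python's own indexing and slicing, which makes them easy to sketch. apply_slice is a hypothetical helper; whether ARK returns a bare item or a one-element list for a single index is an assumption here.

```python
import re

def apply_slice(items: list[str], spec: str):
    """Apply an index ('0', '-1') or a range ('0:3') to a list, mirroring $1[...]."""
    if re.fullmatch(r"-?\d+", spec):
        return items[int(spec)]  # single item; raises IndexError if out of range
    start, _, end = spec.partition(":")
    lower = int(start) if start else None
    upper = int(end) if end else None
    return items[lower:upper]    # end index is exclusive, so 0:3 yields three items

tags = ["News", "Tech", "Web", "AI"]
first = apply_slice(tags, "0")
last = apply_slice(tags, "-1")
first_three = apply_slice(tags, "0:3")
```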

5.3 Mapping ({})

Apply a wrapper to every item in a list. Use %1 as the placeholder.

$1{**%1**} turns [Tag1, Tag2] into [**Tag1**, **Tag2**].

5.4 Joining ([])

Define the separator for your list items. $1[, ] joins items with a comma and space.

THE MASTER EXAMPLE

Query: /> .article: title tags* | # $1 Tags: $2{**%1**}[ | ]

Result: # Article Title Tags: **News** | **Tech** | **Web**
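
The master example can be reproduced with a very small renderer. The Python sketch below is an approximation under stated assumptions, not ARK's Template Engine: render and PLACEHOLDER are hypothetical names, every [...] after a variable is treated as a joiner (slices are not handled), and the default separator of ", " is an assumption.

```python
import re

# $N, optionally followed by a {map pattern} and a [joiner].
PLACEHOLDER = re.compile(r"\$(\d+)(?:\{([^}]*)\})?(?:\[([^\]]*)\])?")

def render(template: str, columns: list) -> str:
    """Fill $N placeholders from columns, applying the map and join steps if present."""
    def substitute(match: re.Match) -> str:
        index, pattern, joiner = match.group(1), match.group(2), match.group(3)
        value = columns[int(index) - 1]  # $1 is the first column
        items = value if isinstance(value, list) else [value]
        if pattern is not None:          # {...} wraps each item; %1 is the item
            items = [pattern.replace("%1", str(item)) for item in items]
        separator = joiner if joiner is not None else ", "  # assumed default
        return separator.join(str(item) for item in items)
    return PLACEHOLDER.sub(substitute, template)

row = ["Article Title", ["News", "Tech", "Web"]]
result = render("# $1 Tags: $2{**%1**}[ | ]", row)
```

The order of operations matters: mapping runs on the individual items first, and joining then collapses the wrapped items into one string.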


CHAPTER 6: Visual Mastery

6.1 X-Ray Mode (Alt+X)

X-Ray Mode is an independent overlay that reveals the tag, ID, and pixel dimensions of any element without touching the site's layout.

6.2 Freeze Mode (Alt+Z)

The "Secret Sauce." It locks the page in its current state. Hover menus won't close, and tooltips won't disappear. This is essential for selecting elements that only exist temporarily.

6.3 The Visual Builder

A point-and-click interface. Select a container, select your data, and click "Save" to generate the ARK query automatically. Use Drill-Down (Shift+Click) to select elements buried behind other layers.


APPENDIX: The AI Architect Prompt

Copy this into ChatGPT, Claude, or Gemini to turn it into an ARK query expert.


SYSTEM PROMPT: ARK EXTRACTION ARCHITECT

You are an expert in the **ARK Extraction Language (v3.1)**. 

### GRAMMAR
- Triggers: `/>` (Magic), `/>>` (Strict).
- Syntax (Magic): `/> SCOPE: COL1 COL2 ... | TEMPLATE`
- Syntax (Strict): `/>> SCOPE >> (COL1_SELECTOR) (COL2_SELECTOR) ...`
- Logic: `*` (List), `@` (Attr), `||` (OR), `!` (Must), `""` (Filter).
- Template: `$N` (Var), `[start:end]` (Slice), `{pattern %1}` (Map), `[joiner]` (Join).

### KEYWORDS
`email`, `phone`, `price`, `date`, `time`, `color`, `number`, `address`, `link`, `img`, `title`, `text`.

### TASK
Convert user requests into a single ARK query code block.