Fg-selective-arabic.bin

```python
# Example usage
# Prompt: "Write a short article about the impact of AI on education in the Arab world"
prompt = "اكتب مقالًا قصيرًا عن تأثير الذكاء الاصطناعي على التعليم في العالم العربي"
print(generate_arabic(prompt))
```

5.4 Serving the Model via FastAPI

```python
from fastapi import FastAPI
from pydantic import BaseModel
```

```bash
# 2️⃣ Install core dependencies
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu124
pip install transformers==4.44.0 sentencepiece tqdm accelerate

# Replace <TOKEN> with the access token you received after agreeing to the license
wget -O fg-selective-arabic.bin "https://huggingface.co/fg-consortium/fg-selective-arabic/resolve/main/fg-selective-arabic.bin?download=true&token=<TOKEN>"
```

Tip: The file is ~6 GB compressed (`.bin.gz`). Use `pigz -d` for faster decompression on multi-core CPUs.

5.3 Loading the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
```
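Before loading the weights, it is worth confirming that the ~6 GB download arrived intact. The sketch below is our own addition: the `sha256_of` helper and the idea of comparing against a published checksum are assumptions, so check the model card for whatever hash the consortium actually provides.

```python
# Sketch: streaming SHA-256 of a large file (hypothetical integrity check;
# the reference hash would come from the model card, not from this article).
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in 1 MB chunks so a ~6 GB checkpoint never sits fully in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo on a tiny stand-in file; point it at fg-selective-arabic.bin after downloading.
with open("demo.bin", "wb") as f:
    f.write(b"hello")
print(sha256_of("demo.bin"))
# → 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```

Streaming the file chunk by chunk keeps memory use constant regardless of checkpoint size.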

1. Introduction

One of the most noteworthy contributions to the Arabic NLP community in 2025 is the Fg-selective-arabic.bin checkpoint, a compact fine-tuned binary released by the Focal-Gating (FG) research consortium. This article unpacks everything a practitioner, researcher, or hobbyist needs to know about this file: its origins, internals, practical deployment, performance, and the broader implications for Arabic AI.

2. What Is “Fg-selective-arabic.bin”?

| Attribute | Description |
|-----------|-------------|
| File type | Serialized PyTorch checkpoint (`.bin`) |
| Model family | Focal-Gating (FG) Transformer, 1.3 B parameters |
| Training regime | Selective fine-tuning on a curated Arabic corpus (≈ 200 B tokens) |
| Primary purpose | High-quality Arabic text generation, summarization, and instruction following |
| Target hardware | GPU-accelerated inference (≥ 8 GB VRAM) and optional CPU-only inference via GGUF conversion |
| License | Apache 2.0 with a “non-commercial-use” addendum (see Section 10) |
| Release date | 3 March 2025 (v1.0) |
| Version | v1.0-selective-2025-03 (semantic versioning) |
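Because the file is a standard serialized PyTorch checkpoint, its parameter names and tensor shapes can be listed with `torch.load` before committing to a full model load. The round-trip below is a toy illustration of the mechanics; the file name `toy.bin` and the miniature state dict are our own stand-ins for the real 1.3 B-parameter file.

```python
import torch

# Save a miniature state_dict the same way a .bin checkpoint is produced...
state = {"embed.weight": torch.zeros(4, 8), "lm_head.weight": torch.zeros(8, 4)}
torch.save(state, "toy.bin")

# ...then reload and inspect it; swap in "fg-selective-arabic.bin" for the real file.
loaded = torch.load("toy.bin", map_location="cpu")
for name, tensor in loaded.items():
    print(name, tuple(tensor.shape))
```

Loading with `map_location="cpu"` lets you inspect the checkpoint on a machine without a GPU.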

Run the API with:

```bash
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 2
```

Now you have a generation service ready for internal tools, chat-bots, or research pipelines.

6. Performance Benchmarks & Comparative Evaluation

| Metric | Fg-selective-arabic.bin | GPT-4-Turbo (Arabic) | LLaMA-2-13B-Arabic | MPT-7B-Arabic |
|--------|-------------------------|----------------------|--------------------|---------------|
| Perplexity (MSA) | 13.7 | 13.9 | 16.4 | 19.1 |
| BLEU (Summarization) | 35.2 | 34.8 | 30.7 | 28.3 |
| ROUGE-L (QA) | 48.5 | 48.1 | 44.0 | 41.6 |
| Inference Latency (RTX 4090, 1 token) | 9 ms | 12 ms | 13 ms | 15 ms |
| VRAM Footprint (FP16) | 7.8 GB | 9.2 GB | 9.8 GB | 8.6 GB |
| Dialectal Accuracy (Egyptian) | 92 % | 90 % | 84 % | 80 % |
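The per-token latencies in the table imply rough single-stream throughput ceilings. A quick back-of-the-envelope conversion (figures copied from the table above; the dictionary itself is our construction):

```python
# Convert per-token latency (ms) into an upper bound on single-stream tokens/second.
latency_ms = {
    "Fg-selective-arabic.bin": 9,
    "GPT-4-Turbo (Arabic)": 12,
    "LLaMA-2-13B-Arabic": 13,
    "MPT-7B-Arabic": 15,
}
throughput = {name: 1000 / ms for name, ms in latency_ms.items()}
for name, tps in throughput.items():
    print(f"{name}: ~{tps:.0f} tokens/s")  # e.g. 9 ms/token → ~111 tokens/s
```

Real throughput is usually higher with batching, since multiple sequences amortize each forward pass.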

```python
app = FastAPI(title="FG-Arabic Generation API")
```

```python
class GenerationRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 150
    temperature: float = 0.8
    top_p: float = 0.95
```
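The request model can be exercised on its own to confirm that omitted sampling fields fall back to their defaults. A self-contained sketch (the example prompt is ours):

```python
from pydantic import BaseModel

class GenerationRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 150
    temperature: float = 0.8
    top_p: float = 0.95

# Only `prompt` is required; the sampling parameters take the defaults above.
req = GenerationRequest(prompt="مرحبا")  # "Hello"
print(req.max_new_tokens, req.temperature, req.top_p)
# → 150 0.8 0.95
```

FastAPI uses this same model to validate incoming JSON and to reject malformed requests with a 422 response.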

```python
@app.post("/generate")
async def generate(req: GenerationRequest):
    text = generate_arabic(
        req.prompt,
        max_new_tokens=req.max_new_tokens,
        temperature=req.temperature,
        top_p=req.top_p,
    )
    return {"generated_text": text}
```
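Once the server is up, any HTTP client can call `/generate`; the JSON body simply mirrors `GenerationRequest`. The sketch below builds and sanity-checks such a payload without a live server (the field values and the prompt are our own examples):

```python
import json

payload = {
    "prompt": "اكتب فقرة قصيرة عن الطقس",  # "Write a short paragraph about the weather"
    "max_new_tokens": 100,
    "temperature": 0.7,
    "top_p": 0.9,
}
body = json.dumps(payload, ensure_ascii=False)
print(body)
# Send it with e.g.:
#   curl -X POST http://localhost:8000/generate \
#     -H "Content-Type: application/json" -d "$BODY"
```

`ensure_ascii=False` keeps the Arabic text readable in the serialized body instead of emitting `\uXXXX` escapes.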

```python
# Load with `torch_dtype` set for mixed-precision
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,  # use bfloat16 on Ampere+ GPUs
    trust_remote_code=True,
)
model.eval()

def generate_arabic(prompt, max_new_tokens=150, temperature=0.8, top_p=0.95):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=temperature,
            top_p=top_p,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

```python
model_path = "fg-selective-arabic.bin"
tokenizer = AutoTokenizer.from_pretrained("fg-consortium/fg-selective-arabic", trust_remote_code=True)
```
