Pipeline

End-to-end pipeline for processing IOM evaluation reports

The Report class orchestrates the full evaluation pipeline:

  1. Download - Fetch PDFs from IOM evaluation repository
  2. OCR - Convert PDFs to markdown with heading hierarchy
  3. Extract - Pull core sections from converted documents
  4. Map - Identify which standardized framework themes are relevant to each report, including the Strategic Results Framework (SRF) (Enablers, Cross-cutting Priorities, and Outputs) and Global Compact for Migration objectives

Report class

The Report class wraps an Evaluation and provides methods for the four pipeline stages: download → ocr → extract → map


Report


def Report(
    ev:Evaluation, # The evaluation metadata object
    pdf_url:str=None, # Optional direct URL to PDF
    results_path:str='data/results', # Path to save/load results
):

An evaluation report with full pipeline support

Exported source
class Report:
    "An evaluation report with full pipeline support"
    def __init__(self,
                 ev:Evaluation,                   # The evaluation metadata object
                 pdf_url:str=None,                # Optional direct URL to PDF
                 results_path:str='data/results'  # Path to save/load results
                ):
        store_attr('ev,pdf_url,results_path')
        self.id = ev.id
        self.pdf_path,self.md_path,self.sections,self.mappings = None,None,None,{}
        self._load_existing()
    
    def _load_existing(self):
        "Load state from saved JSON if it exists"
        p = Path(self.results_path)/f'{self.id}.json'
        if not p.exists(): return
        data = json.loads(p.read_text())
        self.sections,self.mappings = data.get('sections'),data.get('mappings', {})
        if data.get('pdf_path'): self.pdf_path = Path(data['pdf_path'])
        if data.get('md_path'): self.md_path = Path(data['md_path'])
    
    @classmethod
    def from_url(cls,
                 url:str,                         # URL of the evaluation PDF
                 evals:list,                      # List of `Evaluation` objects to search
                 results_path:str='data/results'  # Path to save/load results
                 ):                               # Report initialized from URL
        return cls(find_eval(evals, url, by='url'), pdf_url=url, results_path=results_path)

    @classmethod
    def from_title(cls,
                   title:str,                      # Title to search for
                   evals:list,                     # List of `Evaluation` objects to search
                   results_path:str='data/results' # Path to save/load results
                   ):                              # Report initialized from title
        return cls(find_eval(evals, title, by='title'), results_path=results_path)
Exported source
@patch
def _repr_markdown_(self:Report) -> str:  # Markdown formatted report summary
    "Display report metadata and processing status in Jupyter notebooks"
    title = self.ev.meta.get('Title', 'Untitled')
    year = self.ev.meta.get('Year', 'n/a')
    org = self.ev.meta.get('Evaluation Commissioner', 'Unknown')
    
    status = []
    if self.pdf_path: status.append('✓ PDF downloaded')
    if self.md_path: status.append('✓ Markdown converted')
    if self.sections: status.append(f'✓ Sections extracted (~{n_tokens(self.sections)} tokens)')
    if self.mappings:
        mapped = ', '.join(self.mappings.keys())
        status.append(f'✓ Mappings: {mapped}')
    status_str = ' | '.join(status) if status else 'Not processed'
    
    return f"""
## Report: {title}
**Year:** {year} | **Organization:** {org}  
**ID:** `{self.id}`

**Processing Status:**  
{status_str}

**Documents:** {len(self.ev.docs)} available
"""

Creating reports

Create a report from a URL:

evals = load_evals('files/test/evaluations.json')
url = "https://evaluation.iom.int/sites/g/files/tmzbdl151/files/docs/resources/Abridged%20Evaluation%20Report_%20Final_Olta%20NDOJA.pdf"
report = Report.from_url(url, evals, results_path='files/test/results')
report

Report: Final Evaluation of the EU-IOM Joint Initiative for migrant protection and reintegration in the horn of Africa

Year: 2023 | Organization: IOM
ID: 49d2fba781b6a7c0d94577479636ee6f

Processing Status:
✓ PDF downloaded | ✓ Markdown converted | ✓ Sections extracted (~227 tokens) | ✓ Mappings: enablers, ccps, gcm, outputs, enbs, gcms, outs

Documents: 5 available

Or from a title:

title = 'Evaluation of IOM Accountability to Affected Populations'
report = Report.from_title(title, evals, results_path='files/test/results')
report

Report: Evaluation of IOM Accountability to Affected Populations

Year: 2025 | Organization: IOM
ID: 6c3c2cf3fa479112967612b0baddab72

Processing Status:
✓ PDF downloaded | ✓ Markdown converted | ✓ Sections extracted (~17054 tokens)

Documents: 4 available

Note

When a report is created from a title rather than a URL and the evaluation has several PDFs, ocr() and the downstream methods process the first PDF found in the download directory.
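To make the "first PDF found" behaviour concrete, here is a minimal sketch. The helper `first_pdf` is hypothetical and not part of the library, and sorting by filename is an assumption: this section does not say which ordering the library uses, so the file it actually picks may differ.

```python
import tempfile
from pathlib import Path

def first_pdf(pdf_dir):
    "Hypothetical helper: return the first PDF in `pdf_dir` by sorted filename, or None"
    # Assumption: sorting makes the choice deterministic; the library may not sort
    pdfs = sorted(Path(pdf_dir).glob('*.pdf'))
    return pdfs[0] if pdfs else None

# Directory with two PDFs and one unrelated file
with tempfile.TemporaryDirectory() as tmp:
    for name in ('b_annex.pdf', 'a_main_report.pdf', 'notes.txt'):
        (Path(tmp) / name).touch()
    chosen = first_pdf(tmp)
    chosen_name = chosen.name if chosen else None

# Empty directory: nothing to pick
with tempfile.TemporaryDirectory() as empty:
    nothing = first_pdf(empty)

print(chosen_name, nothing)
```

If a specific document matters, creating the report with Report.from_url avoids relying on directory order.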

Persistence

Reports automatically save after each pipeline stage. Use load_report to resume from any checkpoint.


Report.save


def save(
    path:str=None, # Override default results path
)->Report: # Returns self for method chaining

Save report state to JSON


load_report


def load_report(
    id:str, # Report ID (hash)
    path:str='data/results', # Results directory
)->Report: # The loaded Report

Load a saved Report by id
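The checkpoint format can be illustrated with a small round trip. This is a sketch only: `save_state` and `load_state` are hypothetical stand-ins, not the library's `save` or `load_report`. The file name (`<id>.json`) and the keys (`sections`, `mappings`, `pdf_path`, `md_path`) are taken from `_load_existing` in the exported source above; how the real `save` serializes them is an assumption.

```python
import json
import tempfile
from pathlib import Path

def save_state(report_id, sections, mappings, pdf_path, md_path, results_path):
    "Hypothetical helper: write one JSON checkpoint per report id"
    out = Path(results_path)
    out.mkdir(parents=True, exist_ok=True)
    data = {
        'sections': sections,
        'mappings': mappings,
        # Paths are stored as strings because JSON has no path type
        'pdf_path': str(pdf_path) if pdf_path else None,
        'md_path': str(md_path) if md_path else None,
    }
    p = out / f'{report_id}.json'
    p.write_text(json.dumps(data, indent=2))
    return p

def load_state(report_id, results_path):
    "Hypothetical helper: read the checkpoint back, or return {} if none exists"
    p = Path(results_path) / f'{report_id}.json'
    if not p.exists(): return {}
    return json.loads(p.read_text())

with tempfile.TemporaryDirectory() as tmp:
    save_state('abc123', 'Some text', {'enbs': []}, Path('pdfs/abc123'), None, tmp)
    state = load_state('abc123', tmp)
    missing = load_state('does-not-exist', tmp)

print(state['mappings'], missing)
```

Keeping one file per report id means a missing file simply reads as "not processed yet", which matches the early return in `_load_existing`.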

Resuming from checkpoint

report = load_report('49d2fba781b6a7c0d94577479636ee6f', path='files/test/results')
report

Report: Final Evaluation of the EU-IOM Joint Initiative for migrant protection and reintegration in the horn of Africa

Year: 2023 | Organization: IOM
ID: 49d2fba781b6a7c0d94577479636ee6f

Processing Status:
✓ PDF downloaded | ✓ Markdown converted | ✓ Sections extracted (~8794 tokens) | ✓ Mappings: enablers, ccps, gcm, outputs, enbs, gcms, outs

Documents: 5 available

Pipeline Methods

The pipeline has four main stages: download → ocr → extract → map_*

Download

Downloads the evaluation PDF from IOM’s repository.


Report.download


def download(
    dst:str='data/pdfs', # Destination directory for PDFs
    force:bool=False, # Force re-download
)->Report: # Self for chaining

Download evaluation PDF to dst/eval_id/

url = "https://evaluation.iom.int/sites/g/files/tmzbdl151/files/docs/resources/Abridged%20Evaluation%20Report_%20Final_Olta%20NDOJA.pdf"
report = Report.from_url(url, evals, results_path='files/test/results')
_ = report.download(dst='files/test/pdfs')
Path('files/test/pdfs/49d2fba781b6a7c0d94577479636ee6f').ls()
[Path('files/test/pdfs/49d2fba781b6a7c0d94577479636ee6f/Abridged%20Evaluation%20Report_%20Final_Olta%20NDOJA.pdf'), Path('files/test/pdfs/49d2fba781b6a7c0d94577479636ee6f/Final%20Evaluation%20Report%20Final_Olta%20NDOJA.pdf'), Path('files/test/pdfs/49d2fba781b6a7c0d94577479636ee6f/HoA%20EU%20JI%20Final%20Eval%20-%20Management%20Response%20Matrix%20-%20Final.pdf'), Path('files/test/pdfs/49d2fba781b6a7c0d94577479636ee6f/ISP_IOM_Case-Management-Return-Reintegr-JI-Review_final.pdf'), Path('files/test/pdfs/49d2fba781b6a7c0d94577479636ee6f/Evaluation%20Learning%20Brief_Final_Olta%20NDOJA.pdf')]

OCR

Runs OCR on the PDF using Mistral’s API and converts the result to markdown with a proper heading hierarchy.


Report.ocr


def ocr(
    dst:str='data/md', # Destination directory for markdown files
    add_img_desc:bool=True, # Whether to add image descriptions
    force:bool=False, # Force re-OCR
    kwargs:VAR_KEYWORD
)->Report: # Self for chaining

Run OCR on PDF and fix heading hierarchy

await report.ocr(dst='files/test/md', add_img_desc=False, force=False)
mistocr.pipeline - INFO - Step 1/2: Running OCR on files/test/pdfs/49d2fba781b6a7c0d94577479636ee6f/Abridged%20Evaluation%20Report_%20Final_Olta%20NDOJA.pdf...
mistocr.core - INFO - Waiting for batch job f9e8c79f-b642-4feb-aa0e-471fbb004296 (initial status: QUEUED)
mistocr.core - DEBUG - Job f9e8c79f-b642-4feb-aa0e-471fbb004296 status: QUEUED (elapsed: 0s)
mistocr.core - DEBUG - Job f9e8c79f-b642-4feb-aa0e-471fbb004296 status: RUNNING (elapsed: 2s)
mistocr.core - INFO - Job f9e8c79f-b642-4feb-aa0e-471fbb004296 completed with status: SUCCESS
mistocr.pipeline - INFO - Step 2/2: Fixing heading hierarchy...
mistocr.pipeline - INFO - Done!

Report: Final Evaluation of the EU-IOM Joint Initiative for migrant protection and reintegration in the horn of Africa

Year: 2023 | Organization: IOM
ID: 49d2fba781b6a7c0d94577479636ee6f

Processing Status:
✓ PDF downloaded | ✓ Markdown converted | ✓ Sections extracted (~12640 tokens) | ✓ Mappings: enablers, ccps, gcm, outputs

Documents: 5 available

report.md_path
Path('files/test/md/49d2fba781b6a7c0d94577479636ee6f')
report.md_path.ls()[:2]
[Path('files/test/md/49d2fba781b6a7c0d94577479636ee6f/page_21.md'), Path('files/test/md/49d2fba781b6a7c0d94577479636ee6f/page_15.md')]

Extract

Extracts key sections (executive summary, findings, recommendations, conclusions) from the markdown.


Report.extract


def extract(
    force:bool=False, # Force re-extraction
    kwargs:VAR_KEYWORD
)->Report: # Self for chaining

Extract core sections from markdown

report.extract()

Report: Final Evaluation of the EU-IOM Joint Initiative for migrant protection and reintegration in the horn of Africa

Year: 2023 | Organization: IOM
ID: 49d2fba781b6a7c0d94577479636ee6f

Processing Status:
✓ PDF downloaded | ✓ Markdown converted | ✓ Sections extracted (~12640 tokens) | ✓ Mappings: enablers, ccps, gcm, outputs

Documents: 5 available

print(report.sections[:1000])
## KEY FINDINGS & CONCLUSIONS ... page 2

The JI-HoA supported migrants in dangerous environments with basic emergency care. It provided returnees (who returned with nothing) with a basis to start their reintegration and enhanced economic opportunities for communities as a whole. Therefore, the JI-HoA addressed key priorities of vulnerable populations in the region. Only sporadic examples were found where specific interventions did not meet the needs of the beneficiaries.

While governments where highly interested in the JI-HoA, urgent problems such as COVID-19, security issues and economic crises prevented governments from treating return migration as a priority.

The evaluation found extensive examples of collaboration and alignment between the JI-HoA and other initiatives. The JI-HoA aligned with priorities of regional partners (IGAD).

The Programme has in general met the targets for the specific objectives and their associated results and in many cases even surpassing its targets.

Thematic mapping

Map extracted sections to IOM’s strategic frameworks (SRF and GCM). Each mapping method can be run independently after extract().


Report.ensure_sys_blocks


def ensure_sys_blocks(
    
)->None: # Modifies self in place

Ensure system blocks are available


map_single


def map_single(
    sys_blocks, # System blocks from mk_system_blocks
    theme_type, # One of: 'enbs', 'ccps', 'gcms', 'outs'
    path:NoneType=None, # Path to theme files
    model:str='claude-haiku-4-5', # Model to use for mapping
    gcm_ids:NoneType=None, # GCM IDs for output mapping
    response_format:ModelMetaclass=ThemeScores, # Pydantic model for structured output
    max_tokens:int=8192, # Max tokens for completion
    temperature:float=0, # Temperature for completion
    reasoning_effort:str=None, # Reasoning effort for completion (low, medium, high)
)->dict: # Mapping results

Map system blocks (Report) to a single theme type using appropriate prompts and formatting

Map enablers

Maps to Strategic Results Framework enablers (organizational capabilities).


Report.map_enbs


def map_enbs(
    force:bool=False, # Re-run even if already completed
    kwargs:VAR_KEYWORD
)->Report: # Self for chaining

Map report sections to Strategic Results Framework enablers

Suppose we do not want to restart the pipeline from scratch and instead want to resume from a checkpoint saved earlier:

# Resume from the saved checkpoint
report = load_report('49d2fba781b6a7c0d94577479636ee6f', path='files/test/results')

# Truncate the sections for testing purposes throughout the module
report.sections = report.sections[:1000]

# Mapping enablers
report.map_enbs(model='claude-haiku-4-5', force=False)

Report: Final Evaluation of the EU-IOM Joint Initiative for migrant protection and reintegration in the horn of Africa

Year: 2023 | Organization: IOM
ID: 49d2fba781b6a7c0d94577479636ee6f

Processing Status:
✓ PDF downloaded | ✓ Markdown converted | ✓ Sections extracted (~227 tokens) | ✓ Mappings: enablers, ccps, gcm, outputs, enbs, gcms, outs

Documents: 5 available

sort_by_relevance(report.mappings['enbs'])[:2]
[{'theme_id': '2',
  'theme_title': 'Partnership',
  'relevance_score': 0.71,
  'reasoning': "This report would likely contribute relevant evidence to a synthesis on Enabler 2 (Partnership). The EU-IOM Joint Initiative itself is fundamentally a partnership-based programme, explicitly designed around 'facilitation of dignified voluntary return and the implementation of development-focused and sustainable reintegration policies and processes' in collaboration with multiple Khartoum Process countries (Djibouti, Ethiopia, Somalia, and Sudan). The programme's structure—involving coordination across multiple national contexts and presumably engagement with government and local actors—suggests that partnership mechanisms, coordination frameworks, and multi-country collaboration are likely substantive themes in the full evaluation. The mention of a Regional Coordination Unit coordinating across four countries indicates that the evaluation probably examines IOM's partnership and coordination capacities across national and regional levels. However, the excerpt provided is introductory and does not yet reveal the depth of partnership analysis. A synthesis specialist would likely want to review the findings and conclusions sections to assess how thoroughly the evaluation examines partnership effectiveness, equity, coordination mechanisms, and IOM's capacity to work with national and local actors—all central to Enabler 2. The programme's explicit focus on multi-country coordination and reintegration suggests partnership analysis is probable but not yet confirmed in this excerpt."},
 {'theme_id': '4',
  'theme_title': 'Data and evidence',
  'relevance_score': 0.35,
  'reasoning': "Some limited content in this report may have marginal relevance for a synthesis on Enabler 4 (Data and Evidence). The evaluation's focus on 'voluntary return and reintegration' in the Horn of Africa region suggests that the programme likely involved data collection on migrant flows, return patterns, and reintegration outcomes—areas that could touch on IOM's data systems and evidence use. The mention of a Final Independent Evaluation covering 2017-2022 indicates that the evaluation probably assessed programme outcomes and effectiveness, which would require evidence-based analysis. However, the excerpt provided is purely introductory and does not reveal whether the evaluation substantively examines IOM's data collection methodologies, data management systems, data quality, data sharing practices, or organizational data fluency. The evaluation's focus appears to be on programme implementation and outcomes rather than on IOM's organizational data and evidence capacities as a distinct theme. A synthesis specialist focused on IOM's data systems would likely find this report supplementary at best, unless the full evaluation contains dedicated analysis of how data informed programme decisions or how IOM's data systems performed in this context. Recommend reviewing the findings section to determine if substantive data systems analysis is present."}]

Map CCPs

Maps to SRF Cross-Cutting Priorities.


Report.map_ccps


def map_ccps(
    force:bool=False, # Re-run even if already completed
    kwargs:VAR_KEYWORD
)->Report: # Self for chaining

Map report sections to Strategic Results Framework cross-cutting priorities

report.map_ccps(model='claude-haiku-4-5', force=False)

Report: Final Evaluation of the EU-IOM Joint Initiative for migrant protection and reintegration in the horn of Africa

Year: 2023 | Organization: IOM
ID: 49d2fba781b6a7c0d94577479636ee6f

Processing Status:
✓ PDF downloaded | ✓ Markdown converted | ✓ Sections extracted (~12640 tokens) | ✓ Mappings: enablers, ccps, gcm, outputs, enbs

Documents: 5 available

sort_by_relevance(report.mappings['ccps'])[:2]
[{'theme_id': '3',
  'theme_title': 'Protection-centred',
  'relevance_score': 0.78,
  'reasoning': "This report would likely contribute meaningful evidence to a synthesis on Cross-cutting Priority 3 (Protection-centred approaches). The evaluation explicitly examines IOM's protection commitments across return and reintegration programming, with substantive analysis appearing in multiple sections. Section 4.3.2 is dedicated to 'Safe, humane, dignified voluntary return processes,' directly assessing IOM's core protection responsibility to ensure rights-based, dignity-centred return. The report documents extensive evidence on protection outcomes: Section 4.3.2.1 notes that 95% of assisted migrants were satisfied with travel arrangements and 99.6% felt travel was safe and well-organized; focus groups confirm returnees noted 'their return would not have been possible without IOM' (Section 4.3.2.2). Key findings on vulnerability-informed support appear throughout: Section 4.1.1.1 describes how the programme identified vulnerabilities and provided individualized support; Section 4.3.3.1 discusses psychosocial support for beneficiaries affected by trauma. However, the report identifies protection gaps: Section 4.3.2.2 notes 'gaps remain with regards to coordination mechanisms and referral partners for specialized services,' and Section 4.3.1.2 discusses insufficient post-return psychosocial support services. The Recommendations (particularly Recommendation 6 on 'safe, humane, and orderly migration pathways') underscore protection-centred priorities. A synthesis specialist would find the sections on AVRR processes, vulnerable population responsiveness, safeguarding of returnees through integrated support, and identified protection gaps to be particularly valuable for understanding IOM's protection-centred implementation. The report's depth on dignity in return processes and identification of protection barriers makes it a solid source for evidence synthesis on this priority."},
 {'theme_id': '2',
  'theme_title': 'Equality, Diversity & Inclusion',
  'relevance_score': 0.62,
  'reasoning': "This report contains some relevant content for a synthesis on Cross-cutting Priority 2 (Equality, Diversity & Inclusion), though this commitment area is not a primary evaluation focus. The report demonstrates awareness of vulnerable populations and differentiated needs: Section 4.1.1.1 notes that the programme responded to pressing needs of migrants including those in detention and dangerous environments; Section 4.1.1.2 discusses returnees facing stigma and exclusion from communities; and the evaluation documents gender-specific concerns (mentions of sexual and gender-based violence awareness-raising in Section 4.3.2.1). The programme's integrated approach to economic, social, and psychosocial support could be interpreted as responsive to diverse vulnerability profiles. However, equality, diversity, and inclusion are not systematically analyzed as distinct institutional commitments. The report does not examine gender analysis in programme design, disability inclusion, age-specific approaches, ethnic or racial equity considerations, or meaningful participation mechanisms for diverse population groups. While Section 4.1.1.1 briefly mentions gender mainstreaming through SGBV awareness activities, this appears as one activity rather than evidence of systematic gender integration. A synthesis specialist examining IOM's EDI implementation would likely find this report supplementary rather than essential—it shows some awareness of vulnerable populations but lacks the depth and systematic analysis of inclusion practices that would make it a priority for review."}]

Map GCM objectives

Maps to Global Compact for Migration objectives.


Report.map_gcms


def map_gcms(
    force:bool=False, # Re-run even if already completed
    kwargs:VAR_KEYWORD
)->Report: # Self for chaining

Map report sections to Global Compact for Migration objectives

report.map_gcms(model='claude-haiku-4-5', force=False)

Report: Final Evaluation of the EU-IOM Joint Initiative for migrant protection and reintegration in the horn of Africa

Year: 2023 | Organization: IOM
ID: 49d2fba781b6a7c0d94577479636ee6f

Processing Status:
✓ PDF downloaded | ✓ Markdown converted | ✓ Sections extracted (~227 tokens) | ✓ Mappings: enablers, ccps, gcm, outputs, enbs, gcms, outs

Documents: 5 available

sort_by_relevance(report.mappings['gcms'])[:2]
[{'theme_id': '21',
  'theme_title': 'Cooperate In Facilitating Safe And Dignified Return And Readmission, As Well As Sustainable Reintegration',
  'relevance_score': 0.89,
  'reasoning': "This report would likely be essential for a synthesis on GCM Objective 21. The EU-IOM Joint Initiative's explicit objective is 'to contribute to facilitating orderly, safe, regular and rights-based migration through the facilitation of dignified voluntary return and the implementation of development-focused and sustainable reintegration policies and processes.' This directly aligns with the core themes of GCM Objective 21 and its associated actions on safe and dignified return (Actions 21a, 21e), gender-responsive and child-sensitive return programmes (21b), consular assistance (21d), sustainable reintegration (21h), and community needs assessment (21i). The programme's focus on 'dignified voluntary return' and 'sustainable reintegration' indicates that the evaluation would provide substantive analysis of return facilitation mechanisms, reintegration support, and outcomes. The evaluation's coverage of the 2017-2022 period across four countries (Djibouti, Ethiopia, Somalia, Sudan) would provide evidence on implementation effectiveness, challenges, and best practices in return and reintegration programming. For a synthesis specialist, this report would be a priority resource. Key sections to review would include findings on return facilitation processes, reintegration outcomes, beneficiary experiences, and recommendations on strengthening return and reintegration systems."},
 {'theme_id': '23',
  'theme_title': 'Strengthen International Cooperation And Global Partnerships For Safe, Orderly And Regular Migration',
  'relevance_score': 0.73,
  'reasoning': "This report would likely contribute relevant evidence to a synthesis on GCM Objective 23. The EU-IOM Joint Initiative is itself a multilateral partnership mechanism involving the EU and IOM, and the programme's regional coordination structure (Regional Coordination Unit based in Nairobi) suggests engagement with international cooperation and partnership frameworks. The evaluation would likely examine how the programme facilitates cooperation between participating countries, international organizations, and other stakeholders in implementing return and reintegration policies. This relates to Actions 23a (supporting other States in implementation), 23b (international cooperation on migration issues), and 23e (bilateral and multilateral partnerships). However, international cooperation and partnership-building are examined primarily through the lens of return and reintegration implementation rather than as a distinct policy focus. The report would provide evidence on cooperation mechanisms and partnership effectiveness, but a synthesis specialist would likely prioritize sources explicitly focused on international cooperation frameworks and capacity-building. Key sections to review would be findings on inter-country cooperation, partnership effectiveness, and recommendations on strengthening international coordination."}]

Map outputs

Maps to SRF outputs. If gcm_ids is not provided, the top GCM objective from the prior mapping is used.


Report.map_outs


def map_outs(
    gcm_ids:NoneType=None, # GCM IDs to filter SRF objectives
    force:bool=False, # Re-run even if already completed
    kwargs:VAR_KEYWORD
)->Report: # Self for chaining

Map report sections to Strategic Results Framework outputs
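The default for gcm_ids can be pictured as picking the highest-scoring GCM objective from the earlier mapping. The sketch below is an illustration under assumptions: `top_gcm_ids` is a hypothetical helper, the GCM mapping is assumed to be a list of dicts with `theme_id` and `relevance_score` keys, and the library's actual selection logic is not shown in this section.

```python
def top_gcm_ids(gcm_mapping, n=1):
    "Hypothetical helper: ids of the `n` highest-scoring GCM objectives"
    ranked = sorted(gcm_mapping, key=lambda t: t['relevance_score'], reverse=True)
    return [t['theme_id'] for t in ranked[:n]]

# Toy data reusing the ids and scores shown in the GCM output above
gcms = [
    {'theme_id': '23', 'relevance_score': 0.73},
    {'theme_id': '21', 'relevance_score': 0.89},
]
ids = top_gcm_ids(gcms)
print(ids)
```

Passing gcm_ids explicitly to map_outs bypasses this default, which is useful when outputs should be mapped against a known objective rather than the top-ranked one.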

report.map_outs(model='claude-haiku-4-5', force=False)

Report: Final Evaluation of the EU-IOM Joint Initiative for migrant protection and reintegration in the horn of Africa

Year: 2023 | Organization: IOM
ID: 49d2fba781b6a7c0d94577479636ee6f

Processing Status:
✓ PDF downloaded | ✓ Markdown converted | ✓ Sections extracted (~227 tokens) | ✓ Mappings: enablers, ccps, gcm, outputs, enbs, gcms, outs

Documents: 5 available

sort_by_relevance(report.mappings['outs'])[:2]
[{'theme_id': '2b63',
  'theme_title': 'Returning migrants and returning, relocated and locally integrating displaced persons receive reintegration assistance in line with their needs and those of broader community members.',
  'relevance_score': 0.88,
  'reasoning': "This report would likely be essential for a synthesis on Output 2b63 (reintegration assistance for returning migrants and displaced persons). The EU-IOM Joint Initiative in the Horn of Africa explicitly focuses on 'dignified voluntary return and the implementation of development-focused and sustainable reintegration policies and processes' as stated in the introduction. The evaluation covers the 2017-2022 period of this initiative, which directly targets reintegration assistance as a core deliverable. The program's stated objective of facilitating 'orderly, safe, regular and rights-based migration through... sustainable reintegration policies and processes' indicates that reintegration outcomes are a primary focus of the evaluation. For a synthesis specialist, the findings and conclusions sections would likely contain substantial evidence on reintegration assistance delivery, effectiveness, and community integration outcomes. The report's coverage of the four JI core countries (Djibouti, Ethiopia, Somalia, Sudan) provides geographic specificity on reintegration programming in the Horn of Africa region. Recommend reviewing the findings section for evidence on reintegration assistance mechanisms, beneficiary outcomes, and sustainability of reintegration support."},
 {'theme_id': '3a52',
  'theme_title': 'Migrants of all genders, ages, abilities and other diversities benefit from appropriate and gender-sensitive pre-departure, post-arrival or return assistance and counselling.',
  'relevance_score': 0.81,
  'reasoning': "This report would likely contribute relevant evidence to a synthesis on Output 3a52 (pre-departure, post-arrival, and return assistance and counseling). The EU-IOM Joint Initiative's explicit focus on 'dignified voluntary return' and 'sustainable reintegration' indicates that pre-departure counseling, return assistance, and post-return support are core program components. The evaluation would likely assess the quality, appropriateness, and gender-sensitivity of assistance and counseling services provided to returning migrants. The program's emphasis on 'rights-based migration' and 'orderly, safe, regular' return processes suggests that counseling and individualized support are central to program delivery. However, assistance and counseling appear as part of the broader reintegration assistance agenda rather than as an isolated focus. A synthesis specialist would likely find substantive content on counseling mechanisms, assistance delivery approaches, and their effectiveness in supporting migrants through return and reintegration processes. Recommend reviewing findings sections on return assistance packages, counseling services, gender-sensitive programming, and beneficiary satisfaction with support services."}]

Map all themes

Convenience method to run all mapping stages in sequence.


Report.map_all


def map_all(
    kwargs:VAR_KEYWORD
)->Report: # Self for chaining

Run all theme mappings in sequence

Run full pipeline

Run all four stages in sequence on a single evaluation report. This is a convenience wrapper around the individual Report methods for processing reports from start to finish.


should_force


def should_force(
    force, # Bool to force all steps, or set of step names to force
    step, # Step name to check
)->bool: # Whether to force the step

Check if step should be forced - handles bool or set of step names
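A minimal sketch of the bool-or-set check described above. The function name `should_force_sketch` is hypothetical and the body is an assumption based on the documented signature and docstring, not the library's actual source:

```python
def should_force_sketch(force, step):
    "Return True if `step` should be re-run given a bool or a set of step names"
    # A plain bool applies to every step
    if isinstance(force, bool): return force
    # Otherwise treat `force` as a collection of step names
    return step in force

force_all = should_force_sketch(True, 'ocr')
force_none = should_force_sketch(False, 'ocr')
force_some = [should_force_sketch({'gcms', 'outs'}, s) for s in ('ocr', 'gcms', 'outs')]
print(force_all, force_none, force_some)
```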


run_pipeline


def run_pipeline(
    url:str, # URL of the evaluation PDF
    evals:list, # List of `Evaluation` objects to search
    pdf_dst:str='data/pdfs', # Destination directory for PDFs
    md_dst:str='data/md', # Destination directory for markdown files
    results_path:str='data/results', # Path to save/load results
    ocr_kwargs:dict=None, # Additional arguments passed to ocr (e.g. add_img_desc, model)
    force:bool | set=False, # Force re-run: True for all, or set of step names {'download','ocr','extract','enbs','ccps','gcms','outs'}
    kwargs:VAR_KEYWORD
)->Report: # Fully processed report with all mappings

Run complete pipeline: download → ocr → extract → map_themes

evals = load_evals('files/test/evaluations.json')
url = "https://evaluation.iom.int/sites/g/files/tmzbdl151/files/docs/resources/Abridged%20Evaluation%20Report_%20Final_Olta%20NDOJA.pdf"
report = await run_pipeline(
    url, 
    evals, 
    pdf_dst='files/test/pdfs', 
    md_dst='files/test/md', 
    results_path='files/test/results', 
    ocr_kwargs=dict(add_img_desc=False), 
    force=True,
    model='claude-haiku-4-5'
    )

report
__main__ - INFO - Creating report from URL...
__main__ - INFO - Step 1/7: Downloading PDF...
__main__ - INFO - Step 2/7: Running OCR...
mistocr.pipeline - INFO - Step 1/2: Running OCR on files/test/pdfs/49d2fba781b6a7c0d94577479636ee6f/Abridged%20Evaluation%20Report_%20Final_Olta%20NDOJA.pdf...
mistocr.core - INFO - Waiting for batch job b0b5bc08-4533-4101-8954-f70d39ddc2b1 (initial status: QUEUED)
mistocr.core - DEBUG - Job b0b5bc08-4533-4101-8954-f70d39ddc2b1 status: QUEUED (elapsed: 0s)
mistocr.core - DEBUG - Job b0b5bc08-4533-4101-8954-f70d39ddc2b1 status: RUNNING (elapsed: 2s)
mistocr.core - INFO - Job b0b5bc08-4533-4101-8954-f70d39ddc2b1 completed with status: SUCCESS
mistocr.pipeline - INFO - Step 2/2: Fixing heading hierarchy...
mistocr.pipeline - INFO - Done!
__main__ - INFO - Step 3/7: Extracting sections...
__main__ - INFO - Step 4/7: Mapping enablers...
__main__ - INFO - Step 5/7: Mapping CCPs...
__main__ - INFO - Step 6/7: Mapping GCM objectives...
__main__ - INFO - Step 7/7: Mapping outputs...
__main__ - INFO - Pipeline complete!

Report: Final Evaluation of the EU-IOM Joint Initiative for migrant protection and reintegration in the horn of Africa

Year: 2023 | Organization: IOM
ID: 49d2fba781b6a7c0d94577479636ee6f

Processing Status:
✓ PDF downloaded | ✓ Markdown converted | ✓ Sections extracted (~8794 tokens) | ✓ Mappings: enablers, ccps, gcm, outputs, enbs, gcms, outs

Documents: 5 available

Or, if we prefer to re-run (force) only the last two steps:

report = await run_pipeline(
    url, 
    evals, 
    pdf_dst='files/test/pdfs', 
    md_dst='files/test/md', 
    results_path='files/test/results', 
    ocr_kwargs=dict(add_img_desc=False), 
    force={'gcms', 'outs'},  # Only re-run these two steps (GCM objectives and SRF outputs)
    model='claude-haiku-4-5'
)

report
__main__ - INFO - Creating report from URL...
__main__ - INFO - Step 1/7: Downloading PDF...
__main__ - INFO - Step 2/7: Running OCR...
__main__ - INFO - Step 3/7: Extracting sections...
__main__ - INFO - Step 4/7: Mapping enablers...
__main__ - INFO - Step 5/7: Mapping CCPs...
__main__ - INFO - Step 6/7: Mapping GCM objectives...