
OpenAI Releases GPT-4 with Multimodal Capabilities

GPT-4's multimodal release in March 2023 added image understanding to language capabilities. Longer context windows and improved reasoning expanded enterprise use cases. This set the bar for foundation model capabilities.

Fact-checked and reviewed — Kodi C.


OpenAI released GPT-4 on March 14, 2023, marking a significant advancement in large language model capabilities. GPT-4 accepts both text and image inputs, processes visual information alongside natural language, and generates text responses. The model shows improvements in reasoning, factual accuracy, and safety alignment, with fewer harmful outputs compared to GPT-3.5. Microsoft integrated GPT-4 into Bing Chat, while OpenAI made it available through ChatGPT Plus subscriptions and, via a waitlist, API access.

Performance Benchmarks and Capabilities

GPT-4 achieved strong results on standardized tests, scoring in the 90th percentile on a simulated Uniform Bar Exam, the 93rd percentile on SAT Evidence-Based Reading and Writing, and the 89th percentile on SAT Math. The model earned the top score of 5 on several simulated AP exams, including AP Psychology, AP Statistics, and AP U.S. History, though it scored a 4 on AP Calculus BC. It also passed simulated sommelier theory examinations, and subsequent third-party evaluations reported passing performance on U.S. medical licensing exam questions.

The model's reasoning capabilities improved significantly over GPT-3.5. On the MMLU (Massive Multitask Language Understanding) benchmark, which tests knowledge across 57 subjects, GPT-4 scored 86.4% compared to GPT-3.5's 70.0%. The model shows better contextual understanding, more nuanced instruction following, and an improved ability to handle complex multi-step reasoning tasks. GPT-4 supports context windows up to 32,768 tokens (approximately 24,000 words), enabling analysis of long documents.
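As a rough illustration of what a 32,768-token window means in practice, OpenAI's documented rule of thumb is that one token corresponds to roughly four characters of English text. The sketch below uses that heuristic to pre-check whether a document is likely to fit before making an API call; the reply budget and threshold are illustrative assumptions, not an exact tokenizer (use a real tokenizer such as tiktoken for billing-accurate counts):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4-characters-per-token rule of thumb.

    This is a heuristic, not a tokenizer.
    """
    return max(1, len(text) // 4)


def fits_context(text: str, context_window: int = 32_768,
                 reply_budget: int = 1_024) -> bool:
    """Check whether a prompt likely fits, reserving room for the completion."""
    return estimate_tokens(text) + reply_budget <= context_window
```

By this estimate, a 24,000-word document (roughly 130,000 characters) sits near the 32k limit, which matches the approximation quoted above.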

Multimodal Vision Capabilities

GPT-4's image understanding capability processes photographs, diagrams, screenshots, and documents containing text and visuals. The model interprets charts and graphs, reads handwritten text, explains memes and jokes requiring visual context, and analyzes spatial relationships in images. Organizations use vision capabilities for document processing, visual question answering, accessibility tools describing images for visually impaired users, and educational applications.

OpenAI initially limited image input access to select partners including Be My Eyes (assistive technology for blind users) and academic researchers. The vision capability enables applications like analyzing architectural plans, troubleshooting technical issues from screenshots, processing invoices and receipts, and generating code from UI mockups.

Safety and Alignment Improvements

OpenAI spent six months on safety and alignment work, using adversarial testing and reinforcement learning from human feedback (RLHF). GPT-4 is 82% less likely to respond to requests for disallowed content than GPT-3.5, and 40% more likely to produce factual responses according to internal evaluations. The company engaged external experts in AI safety, cybersecurity, and adversarial testing to identify risks before release.

The model exhibits reduced hallucinations and improved calibration—knowing when it lacks information rather than fabricating plausible-sounding false information. GPT-4 incorporates rule-based reward models guiding model behavior, with fine-tuning on human preferences for helpful, harmless, and honest outputs. OpenAI published a technical report detailing safety work, limitations, and failure modes to inform responsible deployment.

Enterprise API Access and Integration

OpenAI launched GPT-4 API access through a waitlist, prioritizing developers with track records of building on GPT-3.5 and implementing safety best practices. The API offers two variants: gpt-4 (8,192-token context window) and gpt-4-32k (32,768-token context window). Pricing for gpt-4 is $0.03 per 1K prompt tokens and $0.06 per 1K completion tokens; gpt-4-32k costs $0.06 and $0.12 per 1K tokens respectively.
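Using the published per-token prices, a request's cost can be estimated directly. The sketch below hard-codes the March 2023 launch prices quoted above; treat these constants as a snapshot, since OpenAI's pricing has changed since:

```python
# USD per 1,000 tokens at GPT-4 launch (March 2023).
PRICES = {
    "gpt-4": {"prompt": 0.03, "completion": 0.06},
    "gpt-4-32k": {"prompt": 0.06, "completion": 0.12},
}


def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of a single chat completion request."""
    p = PRICES[model]
    return (prompt_tokens * p["prompt"] + completion_tokens * p["completion"]) / 1000
```

For example, a gpt-4 request with 2,000 prompt tokens and 500 completion tokens costs 2 × $0.03 + 0.5 × $0.06 = $0.09, which is why high-volume workloads warrant up-front cost modeling.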

Early enterprise adopters included Duolingo (conversational language learning), Khan Academy (tutoring assistant Khanmigo), Morgan Stanley (knowledge base search), and Stripe (fraud detection and support automation). The model's improved reasoning enables more complex enterprise applications including legal document analysis, financial modeling, medical diagnosis support, and sophisticated coding assistance.

Technical Architecture and Training

While OpenAI did not disclose specific architecture details, GPT-4 is a transformer-based language model trained on diverse internet text, books, and licensed third-party datasets. The company emphasized post-training work including RLHF, rule-based reward models, and extensive safety testing. GPT-4 training completed in August 2022, with subsequent months dedicated to safety improvements and alignment research.

The model incorporates improved tokenization, better handling of multilingual content, and stronger code generation. GPT-4 performs well across Python, JavaScript, TypeScript, and other programming languages, and can explain code, identify bugs, and suggest improvements. OpenAI later added function calling to the API (June 2023), enabling structured integration with external tools, services, and databases.
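Function calling works by declaring JSON-schema descriptions of available functions; when the model decides to call one, it returns the function name and JSON-encoded arguments, which application code must parse and dispatch itself. A minimal dispatch sketch, with a hypothetical get_order_status function standing in for a real backend (the schema shape mirrors the API's `functions` parameter; the message dict below is a simulated model response, not a live API call):

```python
import json

# Hypothetical function schema, as it would be passed in the API request's
# `functions` parameter so the model knows what it can call.
FUNCTIONS = [{
    "name": "get_order_status",
    "description": "Look up the status of a customer order.",
    "parameters": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}]


def get_order_status(order_id: str) -> dict:
    # Stand-in for a real database or service lookup.
    return {"order_id": order_id, "status": "shipped"}


DISPATCH = {"get_order_status": get_order_status}


def handle_function_call(message: dict) -> dict:
    """Route an assistant message containing a `function_call` to local code."""
    call = message["function_call"]
    args = json.loads(call["arguments"])  # arguments arrive as a JSON string
    return DISPATCH[call["name"]](**args)


# Simulated model output, then dispatch:
reply = handle_function_call(
    {"function_call": {"name": "get_order_status",
                       "arguments": '{"order_id": "A-1001"}'}}
)
```

The explicit dispatch table matters: the model only proposes a call, and the application decides whether and how to execute it, which is also a natural place to enforce authorization checks.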

Limitations and Known Issues

OpenAI documented several limitations in the GPT-4 technical report. The model exhibits social biases present in training data, though less pronounced than GPT-3.5. GPT-4 still hallucinates facts and makes reasoning errors, particularly on complex mathematical proofs and edge cases. The model has a knowledge cutoff (September 2021 initially), lacking awareness of recent events.

GPT-4 can produce harmful content when users bypass safety guardrails, requiring ongoing safety monitoring. The model occasionally makes simple mistakes in arithmetic and logical reasoning that humans would not make. Performance degrades on adversarially constructed inputs designed to expose weaknesses. OpenAI committed to iterative deployment, collecting usage data to identify failure modes and improve safety.

Strategic Implications for Organizations

CTIOs should evaluate GPT-4 for applications requiring advanced reasoning, complex instruction following, and multimodal understanding. The model enables new use cases including visual document processing, sophisticated coding assistance, and applications requiring long-context understanding. Organizations must balance capabilities against costs: at launch pricing, GPT-4 was roughly 15-30x more expensive than GPT-3.5, requiring careful cost management.

Technical teams should implement strong testing frameworks to validate GPT-4 outputs, design human-in-the-loop review processes for high-stakes decisions, and establish monitoring systems that detect hallucinations and errors. Organizations should develop prompt engineering expertise, tailor applications to specific domains, and implement fallback strategies for cases where GPT-4 produces incorrect outputs. Data governance policies should address data sent to OpenAI APIs, including data residency, retention, and confidentiality requirements.
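One common pattern for the validation and fallback guidance above is to require structured (JSON) output, check it mechanically, and route anything that fails to human review rather than acting on it. A minimal sketch, where the required field names and the review queue are illustrative assumptions rather than any standard schema:

```python
import json
from typing import Optional

REQUIRED_FIELDS = {"summary", "confidence"}  # illustrative output schema
human_review_queue: list = []  # stand-in for a real review workflow


def validate_model_output(raw: str) -> Optional[dict]:
    """Parse and schema-check a model response; queue failures for review."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        human_review_queue.append(raw)
        return None
    if not isinstance(data, dict) or not REQUIRED_FIELDS <= data.keys():
        human_review_queue.append(raw)
        return None
    return data


ok = validate_model_output('{"summary": "Invoice approved", "confidence": 0.92}')
bad = validate_model_output("The invoice looks fine to me.")  # free text -> review
```

Because the model can always emit free text instead of the requested structure, the validator treats any parse or schema failure as a trigger for the fallback path instead of assuming well-formed output.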

Source material

  1. GPT-4 Technical Report — OpenAI
  2. GPT-4 — OpenAI
  3. ISO/IEC 42001:2023 — Artificial Intelligence Management System — International Organization for Standardization
