The Vision framework has long included text recognition capabilities. We already have a detailed tutorial that shows you how to scan an image and perform text recognition using the Vision framework. Previously, we utilized VNImageRequestHandler and VNRecognizeTextRequest to extract text from an image.
Over the years, the Vision framework has evolved significantly. In iOS 18, Vision introduces new APIs that leverage the power of Swift 6. In this tutorial, we’ll explore how to use these new APIs to perform text recognition. You will be amazed by the improvements in the framework, which save you a significant amount of code when implementing the same feature.

As always, we’ll create a demo application to guide you through the APIs. We’ll build a simple app that allows users to select an image from the photo library, and the app will extract the text from it in real time.
Let’s get started.
Loading the Photo Library with PhotosPicker
Assuming you’ve created a new SwiftUI project in Xcode 16, go to ContentView.swift and start building the basic UI of the demo app:
import SwiftUI
import PhotosUI

struct ContentView: View {
    @State private var selectedItem: PhotosPickerItem?
    @State private var recognizedText: String = "No text is detected"

    var body: some View {
        VStack {
            // Upper part: scrollable area that shows the recognized text
            ScrollView {
                VStack {
                    Text(recognizedText)
                }
            }
            .contentMargins(.horizontal, 20.0, for: .scrollContent)

            Spacer()

            // Lower part: inline photo picker limited to images
            PhotosPicker(selection: $selectedItem, matching: .images) {
                Label("Select a photo", systemImage: "photo")
            }
            .photosPickerStyle(.inline)
            .photosPickerDisabledCapabilities([.selectionActions])
            .frame(height: 400)
        }
        .ignoresSafeArea(edges: .bottom)
    }
}
We use PhotosPicker to access the photo library and load the photos in the lower part of the screen. The upper part of the screen features a scroll view for displaying the recognized text.

We have a state variable to keep track of the selected photo. To detect the selected photo and load it as Data, you can attach the onChange modifier to the PhotosPicker view like this:
.onChange(of: selectedItem) { oldItem, newItem in
    Task {
        // Load the raw bytes of the newly selected photo
        guard let imageData = try? await newItem?.loadTransferable(type: Data.self) else {
            return
        }
    }
}
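The snippet above silently ignores a failed load because it uses try?. If you would rather know why loading failed, a minimal sketch along the following lines may help. The loadImageData function and the PhotoLoadError type are hypothetical names introduced only for this illustration; the sketch relies on the same loadTransferable(type:) call used above.

```swift
import Foundation
import PhotosUI
import SwiftUI

/// Hypothetical error type for this sketch; not part of PhotosUI.
enum PhotoLoadError: Error {
    case noSelection   // the picker item was nil (selection cleared)
    case emptyData     // the item loaded, but produced no data
}

/// Hypothetical helper: loads the raw bytes behind a picker item and
/// throws instead of returning nil, so the caller can report the reason.
func loadImageData(from item: PhotosPickerItem?) async throws -> Data {
    guard let item else {
        throw PhotoLoadError.noSelection
    }
    // loadTransferable(type:) is the same call used in the onChange closure.
    guard let data = try await item.loadTransferable(type: Data.self) else {
        throw PhotoLoadError.emptyData
    }
    return data
}
```

Inside the Task of the onChange closure, you could call this helper in a do/catch block and show the error in recognizedText instead of returning silently.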
Text Recognition with Vision
The new APIs in the Vision framework have simplified the implementation of text recognition. Vision offers 31 different request types, each tailored for a specific kind of image analysis. For instance, DetectBarcodesRequest is used for identifying and decoding barcodes. For our purposes, we will be using RecognizeTextRequest.
In the ContentView struct, add an import statement to import Vision and create a new function named recognizeText:
private func recognizeText(image: UIImage) async {
    guard let cgImage = image.cgImage else { return }

    let textRequest = RecognizeTextRequest()
    let handler = ImageRequestHandler(cgImage)

    do {
        let result = try await handler.perform(textRequest)
        // Keep only the best candidate string of each observation
        let recognizedStrings = result.compactMap { observation in
            observation.topCandidates(1).first?.string
        }
        recognizedText = recognizedStrings.joined(separator: "\n")
    } catch {
        recognizedText = "Failed to recognize text"
        print(error)
    }
}
This function takes in a UIImage object, which is the selected photo, and extracts the text from it. The RecognizeTextRequest object is designed to identify rectangular text regions within an image.
The ImageRequestHandler object processes the text recognition request on a given image. When we call its perform function, it returns the results as RecognizedTextObservation objects, each containing details about the location and content of the recognized text.
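To make the "location and content" part more concrete, here is a hedged sketch that keeps the bounding box and confidence of each observation alongside its string. RecognizedLine and recognizeLines are hypothetical names introduced for this illustration, and the sketch assumes that each observation exposes a normalized boundingBox with a cgRect property and that each candidate exposes a confidence value, so treat it as a starting point rather than a definitive implementation.

```swift
import CoreGraphics
import Vision

/// Hypothetical value type for this sketch; not part of Vision.
struct RecognizedLine {
    let text: String            // best candidate string
    let confidence: Float       // assumed to range from 0.0 to 1.0
    let normalizedBox: CGRect   // assumed to be in normalized image coordinates
}

/// Hypothetical helper that runs text recognition on a CGImage and
/// returns one RecognizedLine per observation that has a candidate.
@available(iOS 18.0, macOS 15.0, *)
func recognizeLines(in cgImage: CGImage) async throws -> [RecognizedLine] {
    let request = RecognizeTextRequest()
    let handler = ImageRequestHandler(cgImage)
    let observations = try await handler.perform(request)

    return observations.compactMap { observation -> RecognizedLine? in
        // Skip observations that produced no candidate at all.
        guard let best = observation.topCandidates(1).first else { return nil }
        return RecognizedLine(
            text: best.string,
            confidence: best.confidence,
            normalizedBox: observation.boundingBox.cgRect
        )
    }
}
```

With the boxes available, you could, for example, draw highlights over the detected text after scaling the normalized rectangles to the displayed image size.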
We then use compactMap to extract the recognized strings. The topCandidates method returns the best matches for the recognized text. By setting the maximum number of candidates to 1, we ensure that only the top candidate is retrieved.
Finally, we use the joined method to concatenate all the recognized strings, separated by line breaks.
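The compactMap and joined steps can be tried out in isolation, without Vision. The sketch below uses a hypothetical StubObservation type whose topCandidates method returns plain strings; the real method returns candidate objects with a string property, so this only illustrates how observations without a candidate are dropped before the strings are joined.

```swift
/// Hypothetical stand-in for a Vision observation, so the sketch runs without Vision.
struct StubObservation {
    let candidates: [String]

    /// Returns at most maxCount candidates, best first (simplified to plain strings).
    func topCandidates(_ maxCount: Int) -> [String] {
        Array(candidates.prefix(maxCount))
    }
}

/// Takes the top candidate of each observation, drops observations
/// that have none, and joins the rest with line breaks.
func joinTopCandidates(_ observations: [StubObservation]) -> String {
    observations
        .compactMap { $0.topCandidates(1).first }
        .joined(separator: "\n")
}

let observations = [
    StubObservation(candidates: ["Hello", "Hallo"]),
    StubObservation(candidates: []),          // no candidate: dropped by compactMap
    StubObservation(candidates: ["World"]),
]

print(joinTopCandidates(observations))
// prints "Hello" and "World" on separate lines
```

Using compactMap rather than map means the resulting array contains only non-optional strings, which is what joined expects.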
With the recognizeText method in place, we can update the onChange modifier to call this method, performing text recognition on the selected photo.
.onChange(of: selectedItem) { oldItem, newItem in
    Task {
        // Bail out if the data cannot be loaded or decoded as an image
        guard let imageData = try? await newItem?.loadTransferable(type: Data.self),
              let image = UIImage(data: imageData) else {
            return
        }
        await recognizeText(image: image)
    }
}
With the implementation complete, you can now run the app in a simulator to test it out. If you have a photo containing text, the app should successfully extract and display the text on screen.

Summary
With the introduction of the new Vision APIs in iOS 18, we can now accomplish text recognition tasks with remarkable ease, requiring only a few lines of code to implement. This enhanced simplicity allows developers to quickly and efficiently integrate text recognition features into their applications.
What do you think about this improvement of the Vision framework? Feel free to leave a comment below to share your thoughts.