
Methods for recording cultural heritage are expanding from approaches centered on photo ledgers, plan views, and cross-sections to more three-dimensional, reusable digital records. Among these, Structure from Motion (SfM) is attracting attention as a technique that reconstructs three-dimensional shapes from multiple photographs, and it is being used in contexts such as heritage surveys, preservation records, before-and-after restoration comparisons, and visualization for exhibitions.


On the other hand, what on-site practitioners really want to know is not how excellent SfM is in theory, but to what extent it can be used in practice for 3D digitization of cultural heritage, what level of accuracy it can achieve in recording, and where its limitations lie. Even if a visually pleasing 3D model is produced, if it cannot be used for dimensional verification, fails to reproduce fine surface relief, or produces errors in assessing areas of loss, its value as a record cannot be considered sufficient.


In this article, we clarify, from a practical perspective, how far 3D digitization of cultural heritage using SfM can go, and provide a detailed explanation of four items to pay particular attention to when verifying accuracy. To be useful both to those considering implementation and to those who have already begun photographing or creating data, we dive into specifics along the workflow of capture, processing, verification, and operation.


Table of Contents

Why SfM-based 3D digitization of cultural heritage is attracting attention

How far can cultural heritage be 3D modeled using SfM?

Accuracy Check 1: Are the capture conditions and image quality sufficient?

Accuracy Check 2: Are the photo overlap rate and capture path appropriate?

Accuracy Check 3: Can the scale and coordinate references be guaranteed?

Accuracy Check 4: Can it detect missing data, distortions, and noise?

Practical considerations when using SfM in cultural heritage surveys

Summary


Why SfM-based 3D digitization of cultural heritage is attracting attention

The main reason SfM has attracted attention in the field of cultural heritage is that it enables three-dimensional recording of shape using a relatively accessible method. Conventional two-dimensional photographs are suited to recording surface condition and color tone, but they cannot directly handle depth information. For that reason, there were limits to later detailed examination of complex decorations, the depth of carvings, warping of components, and surface changes caused by weathering.


By contrast, SfM uses matching points between photos taken from multiple directions to estimate the position and orientation of the camera for each photo and, from those, reconstructs the three-dimensional shape of the subject. In other words, it offers significant value not only in preserving appearance but in preserving form. For objects such as cultural heritage, which cannot be restored once damaged, recording the current condition from as many angles as possible is itself highly meaningful. Another major advantage is high reusability in downstream processes such as records before restoration, comparisons of aging changes, inspections after disasters, and the preparation of materials for public release.


Additionally, SfM is valued for being relatively flexible to deploy in the field. Even without large measurement equipment, shooting methods can be easily adjusted to suit the scale and purpose of the subject, and it can be applied to a variety of targets such as small stone monuments, Buddhist statues, wall decorations, architectural elements, and parts of archaeological remains. The fact that work can be carried out mainly through photography—even in narrow spaces, locations that are difficult to bring equipment into, or places where contact should be avoided—makes it well suited to cultural heritage surveys.


However, being in the spotlight does not mean that SfM can accurately 3D-scan anything. Cultural heritage items vary greatly from one subject to another, and results are heavily influenced by material, shape, state of preservation, ambient lighting conditions, available work paths, and whether close approach is possible. Furthermore, 3D digitization of cultural heritage may aim either to create visually appealing models or to produce measurement outputs that can withstand documentation and verification, and treating these two goals with the same mindset often leads to failure.


Therefore, practitioners need to assess not only the potential of SfM but also, from an accuracy standpoint, which uses its results can be trusted for. If this is left ambiguous, the resulting data may serve as internal reference material but be unsuitable for formal records or comparative verification. To make effective use of SfM in the 3D digitization of cultural heritage, deciding in advance under what conditions and to what level of reproducibility the work will be done matters more than the method itself.


How far can cultural heritage be 3D modeled using SfM?

The range of cultural heritage that can be 3D-digitized using SfM is broader than you might imagine, but its limits are also clear. SfM is relatively good at handling subjects that have surface features, where sufficient shooting directions can be secured, and where lighting changes are minimal. For example, stone carvings and weathering marks, joints in wooden elements, overlapping roof tiles, wall surface reliefs, the undulations of archaeological remains, and shadow variations on sculpture surfaces tend to yield matching points between photographs and are well suited to shape reconstruction. If you can ensure a certain level of capture density, you can create models that are useful not only for grasping the overall shape but also for observing local surface conditions.


On the other hand, the subjects it has difficulty with are also clear. Areas with uniform surfaces and few patterns, strongly reflective faces, materials that are nearly transparent, parts whose texture easily changes due to moisture, and intricate areas where it is hard to secure a shooting angle tend to make shape reconstruction unstable. For example, glossy decorative surfaces, areas where shadows are flattened in dark conditions, sections with a series of thin protrusions, recessed undersides, and places with poor visibility due to surrounding obstructions are prone to gaps or distortions.


The important point here is that what SfM excels at is three-dimensional reconstruction within the range that can be inferred from photographs, and it cannot magically fill in unseen surfaces or surfaces with little information. In other words, things that were not photographed will not be reconstructed, and things that were captured only ambiguously will remain ambiguous when turned into three dimensions. In the 3D digitization of cultural heritage, if this principle is not understood, one may place too much trust in the finished model.


The accuracy obtainable with SfM is judged by different criteria depending on the purpose. For a viewing model intended for exhibition, it may be sufficiently practical even if there are some local errors. However, for applications that require quantitative reliability—such as comparing crack widths, assessing deformation magnitudes, verifying the dimensions of repair components, or checking long-term changes—merely having a model that looks correct is not enough. You need to define in advance how much dimensional error is acceptable and which parts’ reproducibility should be given top priority.


Furthermore, cultural properties are not a uniform category. Depending on the subject—entire buildings, stone monuments, Buddhist statues, areas around murals, exposed surfaces of archaeological features, timber framing members, etc.—the required shooting distance, the demanded level of precision, and the risks that must be managed differ. What is effective for grasping overall shape may be insufficient for reading fine surface damage. Conversely, even if a method is suitable for localized high-density imaging, it can be difficult to capture wide areas with uniform quality in a short time.


Thus, while SfM can achieve 3D digitization of cultural heritage to a considerable extent, what determines its value is not the name of the method but whether it meets the accuracy required for the intended purpose. In practice, the criterion should be not "whether 3D digitization was achieved with SfM" but "whether the subject was digitized with an accuracy usable for the intended purpose." The four accuracy-check items explained below are indispensable for making that judgment.


Accuracy Check 1: Are the capture conditions and image quality sufficient?

The first checklist item is shooting conditions and image quality. The accuracy of SfM is not something that suddenly appears during processing; it is largely determined at the shooting stage. No matter how much you refine processing parameters, if the source images lack sufficient information, stable 3D reconstruction is impossible. One of the most common failures in cultural heritage surveys is discovering problems after model creation when retaking photos is difficult. Cultural heritage cannot always be revisited, and there may be restrictions on access or public display, so ensuring quality at the shooting stage is especially important.


The first thing to check is whether the image is sharp. Camera shake, misfocus, crushed shadows, blown highlights, strong shadows, and excessive compression reduce the accuracy of extracting corresponding points. The surfaces of cultural heritage often exhibit very subtle variations in color and shape, and even slight image quality degradation directly affects reconstruction quality. You should zoom in and inspect the captured images on site to confirm that surface details are clearly visible, contours are not smeared, and differences in texture have not been lost.
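
The on-site zoom check can be complemented by a rough numeric screen. The sketch below is one possible approach, not a method prescribed by this article: it scores sharpness with the variance of a simple Laplacian filter and measures the share of pixels that are crushed or blown out. The function names, the limits of 5 and 250, and the toy patches are illustrative assumptions. In practice the gray values would come from crops of the actual photographs.

```python
def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over a 2D list of gray values."""
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def clipped_fraction(gray, low=5, high=250):
    """Share of pixels at or beyond the assumed shadow/highlight limits."""
    pixels = [p for row in gray for p in row]
    return sum(1 for p in pixels if p <= low or p >= high) / len(pixels)


# Toy 5x5 patches standing in for crops of real photographs.
textured = [[200 if (x + y) % 2 else 40 for x in range(5)] for y in range(5)]
featureless = [[128] * 5 for _ in range(5)]
blown_out = [[255] * 5 for _ in range(5)]

sharper = laplacian_variance(textured) > laplacian_variance(featureless)
```

A low score relative to neighbouring shots of the same surface suggests blur or lost texture. The score has no absolute meaning, so it is best compared within a single shooting session.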


The next important consideration is the stability of the lighting environment. In SfM, it is desirable that the same subject appears similarly across multiple images. However, for outdoor cultural heritage sites the sun angle changes with the time of day, and shadows can change dramatically during continuous shooting. In environments where dappled sunlight or reflected light moves, the appearance of the same spot can vary from photo to photo, making correspondences unstable. On surfaces with a lot of relief, shadow differences can sometimes aid feature extraction, but if the changes are too large they become counterproductive. What matters is not the shadows themselves but that conditions do not fluctuate greatly over the entire shoot.
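
As a rough illustration of watching for fluctuation over the whole shoot, the sketch below flags images whose mean brightness departs from the median of the set. Mean brightness is only a crude proxy for lighting conditions, and the 15 % tolerance and the sample values are assumptions made for this example.

```python
from statistics import median


def flag_exposure_drift(mean_brightness, tolerance=0.15):
    """Indices of images whose mean brightness differs from the set median
    by more than `tolerance`, expressed as a fraction of the median."""
    mid = median(mean_brightness)
    return [
        i for i, value in enumerate(mean_brightness)
        if abs(value - mid) > tolerance * mid
    ]


# Hypothetical per-image mean brightness values, in capture order.
brightness = [118, 121, 119, 123, 120, 96, 91, 122]
drifted = flag_exposure_drift(brightness)
```

Flagged images are candidates for a visual check or a reshoot. The screen does not detect moving shadows that leave the average brightness unchanged.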


Consistency in shooting distance should not be overlooked. If overly close photos and overly distant photos are mixed, the resolution and appearance can change dramatically, undermining the overall uniformity of the model. If you take extreme close-ups of only parts of a cultural property and then abruptly switch to full shots, the connections can become unstable. Separating distances for overall overview and for detailed inspection is effective, but even in that case you need images at intermediate distances to bridge the two. Gradually changing the distance in steps and maintaining continuity helps ensure accuracy.
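
One way to reason about stepped distances is the object-space size of a pixel, often called ground sampling distance (GSD), which for a surface facing the camera is the distance multiplied by the pixel pitch and divided by the focal length. The sketch below uses that relation to list adjacent distance tiers whose resolution jumps by more than a chosen ratio. The limit of 2.0 and the camera values are assumptions for illustration, not recommendations from this article.

```python
def gsd_mm(distance_m, focal_length_mm, pixel_pitch_um):
    """Object-space size of one pixel (mm) for a surface facing the camera."""
    return (distance_m * 1000.0) * (pixel_pitch_um / 1000.0) / focal_length_mm


def scale_jumps(distances_m, focal_length_mm, pixel_pitch_um, max_ratio=2.0):
    """Adjacent distance tiers whose GSD ratio exceeds the assumed max_ratio."""
    tiers = sorted(distances_m)
    gsds = [gsd_mm(d, focal_length_mm, pixel_pitch_um) for d in tiers]
    return [
        (tiers[i], tiers[i + 1])
        for i in range(len(tiers) - 1)
        if gsds[i + 1] / gsds[i] > max_ratio
    ]


# Hypothetical camera: 35 mm lens, 4 micrometre pixel pitch.
abrupt = scale_jumps([0.3, 3.0], 35, 4)
stepped = scale_jumps([0.3, 0.5, 0.9, 1.6, 3.0], 35, 4)
```

Here the plan that jumps from 0.3 m to 3.0 m is reported as a gap, while the plan with intermediate distances is not.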


Furthermore, photographic considerations tailored to the materials of cultural properties are indispensable. Subjects with rich surface textures, such as stone or earthen remains, are relatively easy to reconstruct, whereas surfaces that are nearly monochrome or areas where features have been worn away are at a disadvantage. Even with wood, uniformly painted or glossy surfaces make feature extraction difficult. In such cases, simply increasing the number of shots may not lead to improvement; it is necessary to adjust the angle and the lighting to accurately capture surface information.


In other words, checking shooting conditions and image quality is not merely about confirming that photos were taken. It means verifying that the necessary information has been recorded in a form usable for 3D reconstruction. On site, attention tends to focus on the number of shots taken, but what determines accuracy is not the number itself, but the quality of information in each image and the overall consistency. For SfM of cultural heritage, an approach that builds accuracy at the time of shooting is more important than relying on post-processing to fix things.


Accuracy Check 2: Are the photo overlap rate and capture path appropriate?

The second checkpoint is the overlap between photos and the capture path. SfM assumes that adjacent photos have sufficient common areas. If this common area is insufficient, the relative positions between photos cannot be stably estimated, causing parts of the model to shift, become fragmented, or be locally distorted. When 3D-digitizing cultural heritage, people tend to assume that a stationary subject makes things safe, but in reality, if the capture path is poor, accuracy can easily collapse even for a static subject.


The reason overlap rate is important is that observing the same spot multiple times from different angles provides redundancy for position estimation. For example, if a surface feature appears in only one image, its three-dimensional position cannot be triangulated at all, and with only two images the estimate remains sensitive to error. If the feature is observed consistently across several photos, the effects of errors are easier to average out. For objects with complex shapes, such as cultural heritage artifacts, continuous frontal shooting is often insufficient, and it is necessary to ensure overlap from oblique angles and from above and below as well.
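
For a roughly flat surface photographed while moving sideways, overlap can be estimated before shooting from simple geometry: the image footprint is the distance multiplied by the sensor width and divided by the focal length, and the overlap is one minus the ratio of shot spacing to footprint. The sketch below is a simplified planning aid under those assumptions. The camera values are hypothetical, and curved or deeply relieved subjects will behave differently.

```python
def footprint_m(distance_m, focal_length_mm, sensor_width_mm):
    """Width of surface covered by one image, for a surface facing the camera."""
    return distance_m * sensor_width_mm / focal_length_mm


def overlap_ratio(spacing_m, distance_m, focal_length_mm, sensor_width_mm):
    """Fraction of the footprint shared by two shots spaced `spacing_m` apart."""
    footprint = footprint_m(distance_m, focal_length_mm, sensor_width_mm)
    return max(0.0, 1.0 - spacing_m / footprint)


def max_spacing_m(target_overlap, distance_m, focal_length_mm, sensor_width_mm):
    """Largest sideways step that still reaches the target overlap."""
    footprint = footprint_m(distance_m, focal_length_mm, sensor_width_mm)
    return (1.0 - target_overlap) * footprint


# Hypothetical setup: 24 mm lens, 36 mm wide sensor, 2 m from a wall.
overlap = overlap_ratio(0.6, 2.0, 24, 36)
step = max_spacing_m(0.8, 2.0, 24, 36)
```

With these values the footprint is 3 m wide, so a 0.6 m step gives about 80 % overlap.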


Designing the shooting path is also extremely important. A common mistake is to move around the subject without a plan: the photographer believes the necessary angles have been captured, yet the images do not connect well during processing. For example, jumping straight to the side after photographing the front of the subject can weaken the connection between the frontal and side image groups. To reliably reconstruct the three-dimensional shape of cultural heritage objects, it is fundamental to proceed smoothly around the subject so that adjacent shots reliably overlap.


Also, for subjects with significant vertical variation, a shooting plan that accounts for height is necessary. Stone pagodas, statues, gates, and architectural elements, for example, will lack top and bottom information if you only circle them at a single height. Conversely, taking only downward-looking photos weakens the side geometry. To improve the accuracy of 3D reconstruction, it is effective to shoot in layered passes from multiple heights and angles rather than following a single ring-shaped path. This improves the continuity of surface geometry and reduces local gaps.
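
The idea of layered passes can be sketched as a simple station list: several rings around the subject, one per height, with a fixed angular step. This is a planning illustration only. The radius, heights, and 15-degree step are assumed values, and real sites rarely allow a perfect circle.

```python
import math


def ring_stations(radius_m, heights_m, step_deg):
    """Camera stations (x, y, z) on one ring per height around the origin."""
    count = math.ceil(360 / step_deg)
    stations = []
    for height in heights_m:
        for k in range(count):
            angle = math.radians(k * 360 / count)
            stations.append((
                round(radius_m * math.cos(angle), 3),
                round(radius_m * math.sin(angle), 3),
                height,
            ))
    return stations


# Hypothetical plan: three rings at 2 m radius around a stone pagoda.
plan = ring_stations(2.0, [0.5, 1.5, 2.5], 15)
```

Stations that turn out to be unreachable on site can be struck from the list, which also documents where blind spots are expected.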


In the case of cultural properties, restrictions on approaching and on scaffolding can prevent establishing an ideal capture path. In such cases, rather than trying to force coverage of areas that cannot be captured, it is important to clarify which parts will be blind spots and to design the most stable capture path possible within the areas that can be acquired. It is dangerous to treat a completed model as if it were complete 360-degree data when some faces could not be photographed. You must clearly distinguish between acquired and missing areas and judge which portions of the model are usable accordingly.


Checking the overlap rate and shooting path is not merely a matter of shooting procedure. These factors directly affect how well the model connects, how much distortion and missing data it contains, and how stable its local accuracy is. When you find problematic areas while inspecting point clouds or meshes in processing software, it is important not to attribute the cause only to processing settings but to step back and check the shooting path and any lack of overlap. For SfM of cultural heritage, it is reasonable to think of the shooting sequence as the framework on which accuracy is built.


Accuracy Check 3: Can the scale and coordinate references be guaranteed?

The third item to check is the scale and coordinate reference. Because SfM reconstructs shape from the relationships between photos, if no reference is provided it may produce a valid relative 3D shape but cannot guarantee how accurate it is in real-world dimensions. In other words, even if the model looks plausible, its lengths, heights, and positions may not be aligned to standards usable in practice. If you use 3D digitization of cultural heritage for recording or comparison, this point cannot be avoided.


In particular, in cultural heritage surveys it is not uncommon to be asked later to verify dimensions. For comparisons before and after repairs, understanding component dimensions, checking for deformations, and cross-referencing with previous years' data, judgments become ambiguous if the model's scale cannot be trusted. If the goal is only to reproduce appearance, some errors may be acceptable, but considering accountability for survey results, it is desirable for the 3D data to have a defined reference scale.


What becomes important here is which reference standard you use to manage the model. At a minimum, a length reference corresponding to the actual size must be reflected in the model. Even when only part of the subject is modeled at close range, an ambiguous relationship to the actual size can cause misunderstandings when evaluating fine surface damage. Furthermore, if you anticipate comparing data from multiple time points or coordinating with surrounding surveys, you need to consider not only simple scale matching but also positional consistency.
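
Reflecting a length reference in the model comes down to one ratio: the known real length divided by the same length measured in model units. The sketch below assumes a scale bar of known length whose two end marks have been identified in the unscaled model. The coordinates and the 0.5 m bar length are made-up values.

```python
import math


def scale_factor(model_a, model_b, true_length_m):
    """Factor that converts model units to metres, from one known length."""
    return true_length_m / math.dist(model_a, model_b)


def apply_scale(points, factor):
    """Scale every model point uniformly about the origin."""
    return [tuple(c * factor for c in p) for p in points]


# Hypothetical scale bar: end marks in model units, 0.5 m apart in reality.
factor = scale_factor((0.0, 0.0, 0.0), (0.0, 3.0, 4.0), 0.5)
scaled = apply_scale([(10.0, 0.0, 0.0), (0.0, 3.0, 4.0)], factor)
```

Using several bars placed in different directions and comparing the factors they give is a simple way to notice an unreliable reference.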


In the field of cultural heritage, site conditions often prevent strict coordinate control. Precisely for that reason, it is important to make clear what level of reference has been provided. Whether a model is relative, has true-scale references, or is tied to site coordinates greatly changes how it can be used. If this is left ambiguous in operation, the meaning of the data can be miscommunicated when shared, leading to confusion in subsequent processes.
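
One lightweight way to keep the reference level from becoming ambiguous when data is shared is to store it with the model as explicit metadata. The sketch below is a hypothetical record structure, not a standard format. The three levels mirror the distinction made above, and the class and method names are invented for illustration.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class ReferenceLevel(IntEnum):
    RELATIVE = 0          # shape only, no real-world scale
    TRUE_SCALE = 1        # scaled with a known length reference
    SITE_COORDINATES = 2  # tied to site or survey coordinates


@dataclass
class ModelRecord:
    name: str
    reference: ReferenceLevel
    unrecorded_areas: list = field(default_factory=list)

    def supports_dimension_checks(self):
        return self.reference >= ReferenceLevel.TRUE_SCALE

    def supports_site_overlay(self):
        return self.reference >= ReferenceLevel.SITE_COORDINATES


# Hypothetical record for a scaled model with one face not photographed.
record = ModelRecord("stone monument A", ReferenceLevel.TRUE_SCALE, ["rear face"])
```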


Moreover, providing a reference is not the end. After modeling, you must verify that the reference has been properly reflected throughout the entire model. Even if local dimensions match, distortions can accumulate at the global scale. Conversely, even if the overall shape is well formed, local dimensions in critical areas can show deviations. Therefore, what needs to be checked is not just consistency at a single point but both the global and local aspects. For objects such as cultural heritage, which have both value as a whole and value in their details, checking only one or the other is insufficient.
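
Checking both the global and the local aspects can be done by comparing lengths taken from the model with independent measurements that were not used to scale it. The sketch below reports absolute and relative differences for each check length. All labels and numbers are invented for illustration.

```python
def check_lengths(checks):
    """checks: (label, scope, model_length_m, measured_length_m) tuples."""
    report = []
    for label, scope, model_m, measured_m in checks:
        report.append({
            "label": label,
            "scope": scope,
            "residual_mm": (model_m - measured_m) * 1000.0,
            "relative": abs(model_m - measured_m) / measured_m,
        })
    return report


# Hypothetical check lengths measured independently on site.
checks = [
    ("overall height", "global", 3.204, 3.200),
    ("base width", "global", 1.498, 1.500),
    ("inscription stroke width", "local", 0.0212, 0.0200),
]
report = check_lengths(checks)
worst = max(report, key=lambda row: row["relative"])
```

In this example the overall height is off by 4 mm, which is small in relative terms, while the 1.2 mm difference on the inscription stroke is about 6 % of its width. Which of the two matters depends on the intended use.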


Considering future use, the value of linking 3D models with surrounding information is increasing. Repair histories, location information, photographic records, and inspection results are all easier to associate with current-condition records when the positional reference is organized. For example, if reference points can be confirmed and coordinates acquired efficiently on site, photographic data are less likely to end up as standalone deliverables and become easier to use continuously. In such situations, a means of quickly verifying positional references on site, such as LRTK, a smartphone-mounted GNSS high-precision positioning device, makes it easier to structure the overall recording work around cultural heritage. SfM itself is a photo-based 3D method, but in practice, planning it together with the creation of positional references increases the reusability of the outputs.


Accuracy Check 4: Can it detect missing data, distortions, and noise?

The fourth checkpoint is the verification of missing parts, distortion, and noise. When creating 3D models of cultural heritage using SfM, it is tempting to feel reassured once the model displays cleanly. However, a neat display does not mean the shape is correct. With complex and delicate objects like cultural heritage, it is the less conspicuous errors that are more likely to cause problems later.


Missing geometry refers to a condition in which surfaces or shapes that should be present are absent from the model. Causes include insufficient image capture, insufficient overlap, occlusions, lack of surface features, and filtering settings that remove data during processing. For cultural heritage objects, areas prone to missing geometry include deep recesses of intricate carvings, the backs of components, under eaves, and the insides of cracks. If a missing area appears to have been filled in from the surrounding geometry and is taken for observed surface, there is a risk of misinterpreting the shape itself.
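
On a triangle mesh, holes can be located mechanically: an edge that belongs to only one triangle lies on an open boundary. The sketch below counts edge usage to list such edges. It is a generic mesh check rather than a feature of any particular SfM package, and the outer rim of an intentionally open model (for example a single wall face) will be reported too.

```python
from collections import Counter


def boundary_edges(triangles):
    """Edges used by exactly one triangle, as sorted vertex-index pairs."""
    counts = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[(min(u, v), max(u, v))] += 1
    return sorted(edge for edge, n in counts.items() if n == 1)


# Toy example: a closed tetrahedron, then the same mesh with one face missing.
closed = [(0, 1, 2), (0, 1, 3), (1, 2, 3), (0, 2, 3)]
with_hole = closed[:3]

rim = boundary_edges(with_hole)
```

If hole-filling was applied during processing, this check will find nothing, so the processing settings should be recorded alongside the model.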


Distortion is a condition in which a shape is unnaturally stretched or curved, either overall or locally. It is likely to occur in areas where photographic overlap is weak, when a long surface is followed along a monotonous capture path, or when scale references are lacking. On wall surfaces of cultural heritage buildings, long structural members, rows of stones, or consecutive steps, subtle gradual distortions that are hard to notice at a glance can be introduced. If these are overlooked, errors will arise in assessments of planarity, alignment, and deformation.


Noise can appear as points that seem to float above the surface, rough anomalous shapes, or unnatural protrusions and holes. It can be caused by light reflections, moving shadows, background misrecognition, or degraded image quality. It is problematic because it is easily confused with the fine unevenness of cultural heritage surfaces. In particular, when interpreting traces of weathering or damage, care must be taken not to mistake noise for actual surface changes.
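
Floating points can be screened with a statistical outlier test of the kind found in common point cloud tools: compute each point's mean distance to its nearest neighbours and flag points whose value is far above the average. The sketch below is a brute-force version suitable only for small samples. The neighbour count and threshold are assumptions. Because genuine fine relief can also be flagged, the result is a list for review against the photographs, not for automatic deletion.

```python
import math
from statistics import mean, pstdev


def flag_outliers(points, k=3, sigma=2.0):
    """Indices of points whose mean distance to their k nearest neighbours
    exceeds the overall mean by more than `sigma` standard deviations."""
    knn_mean = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        knn_mean.append(mean(dists[:k]))
    limit = mean(knn_mean) + sigma * pstdev(knn_mean)
    return [i for i, d in enumerate(knn_mean) if d > limit]


# Toy cloud: a flat 3x3 patch plus one point floating 5 units above it.
patch = [(float(x), float(y), 0.0) for y in range(3) for x in range(3)]
cloud = patch + [(1.0, 1.0, 5.0)]

suspects = flag_outliers(cloud)
```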


To discern these, simply looking at a single display mode is insufficient. Both an overview from a three-dimensional display and a close-up view that examines surface continuity are necessary. Also, where possible, it is important to compare with the original images to confirm that the forms that should appear in the photographs are reproduced in the model. In cultural heritage documentation, the post-processed model alone is not the whole truth; a comprehensive evaluation that includes on-site observation and photographic records is required.


Even more important than whether errors exist is understanding where those errors are concentrated. A model can be satisfactory overall while exhibiting reduced accuracy only in critical areas. For example, inscriptions, joints, broken edges, repair boundaries, and surface decorations — the locations you most want to use for decision-making often have difficult imaging conditions and are prone to local errors. Therefore, inspection targets should be defined by prioritizing the areas that are important for the intended use, not by the overall average.
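
The difference between an overall average and a per-area view can be shown with a small grouping step. The sketch below assumes that deviations between the model and independent check measurements are already available and tagged by region. The region names and values are invented.

```python
from collections import defaultdict


def summarize_by_region(deviations):
    """deviations: (region, absolute_deviation_mm) pairs."""
    groups = defaultdict(list)
    for region, value in deviations:
        groups[region].append(value)
    return {
        region: {"mean": sum(v) / len(v), "max": max(v)}
        for region, v in groups.items()
    }


# Hypothetical check results in millimetres.
deviations = [
    ("wall face", 0.4), ("wall face", 0.5), ("wall face", 0.3), ("wall face", 0.4),
    ("inscription", 2.1), ("inscription", 2.5),
]
summary = summarize_by_region(deviations)
overall_mean = sum(v for _, v in deviations) / len(deviations)
```

Here the overall mean is about 1.0 mm, which hides the fact that the inscription, the area most likely to be used for decisions, deviates by more than 2 mm.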


In practice, it is more realistic to clarify how far a model can be trusted and from what point it should be treated as a reference, rather than pursuing a perfect model. In SfM-based 3D digitization of cultural heritage, the important thing is not to eliminate errors but to understand where errors occur and what their nature is, and to use the model appropriately on that basis. If you can correctly detect missing parts, distortions, and noise, you will avoid overestimating the model’s value and be able to decide when re-photography or supplementary documentation is necessary.


Practical considerations when using SfM in cultural heritage surveys

So far we have reviewed four accuracy verification items, but in practice, operational design is also important. When using SfM in cultural heritage surveys, the photography, data processing, and record verification roles are often separated, and if it becomes unclear who made which decision, accuracy management can easily become a mere formality. Therefore, on site it is desirable to share in advance as much as possible the shooting plan, the way standards are set, the checklist items, and the criteria for deciding whether to re-shoot.


What's particularly important is to decide up front what the final deliverable will be used for. Preservation records, comparative research, reference for repair planning, and public visualization each demand different levels of quality. If the intended use differs, the required capture density, the focus of verification, and the strictness of reference control will also change. Nevertheless, if you proceed with "let's just digitize it in 3D for now" while leaving the purpose vague, you'll often discover later that you lack necessary information.


Also, it is safer not to regard SfM as an all-purpose standalone method. In cultural heritage surveys, it is standard practice to combine multiple recording methods such as photographic documentation, measurement verification, field notes, plan and section drawings, and the organization of location information. Within that set of methods, SfM is a powerful technique that provides three-dimensional reconstruction and high reusability, but it does not necessarily meet all accuracy requirements on its own. In particular, for the management of positions and coordinates, consistency with surrounding conditions, and linking to comparisons with previous years, the overall reliability of the results increases the more on-site reference verification methods are available.


Furthermore, cultural properties are subjects whose on-site conditions are difficult to reproduce. Weather, lighting, the surrounding environment, access restrictions, and the presence or absence of scaffolding are not necessarily the same each time. Therefore, preparation to capture everything in a single session is important. Even if you later realize there was insufficient photography, the burden of revisiting can be greater than at a typical site. By performing a preview check immediately after shooting and establishing a workflow to identify blind spots and missing areas on the spot, you can reduce rework in subsequent processes.


Attention is also required from the perspective of data sharing. Although a 3D model may at first glance appear rich in information and easy for stakeholders to understand, sharing it without explaining the accuracy conditions and the extent of acquisition can lead recipients to place undue trust in it. When using it as a record of cultural heritage, it is desirable to make clear which parts are highly reliable and where there are gaps or uncertainties. Conveying the assumptions behind its use, and not only a model that is easy to read, is important for preserving practical quality.


Summary

SfM is a highly effective method for 3D digitization of cultural heritage. Because it can reconstruct three-dimensional shapes from multiple photographs, it makes it easy to record surface details and spatial relationships that cannot be fully captured by two-dimensional photos, greatly expanding the possibilities for preservation, comparison, sharing, and visualization.


In particular, the ability to review shapes afterward, to more easily track changes across multiple time points, and to record while minimizing physical contact on site are strengths that make it well suited to cultural heritage surveys.


However, how far SfM can go in 3D digitization of cultural heritage is not determined by the name of the method alone. Whether the results are usable in practice depends on carefully checking four items: shooting conditions and image quality; the photo overlap rate and shooting path; scale and coordinate references; and the detection of missing data, distortions, and noise. Producing a visually tidy 3D model is different from creating records that can withstand research and preservation. If you understand this difference and operate accordingly, SfM can be a powerful tool in the field of cultural heritage.


To make 3D documentation of cultural heritage continuously useful in the field, it is important not only to carry out the 3D digitization itself but also to establish positional reference frameworks and link records with contextual information. In situations where you want to quickly confirm the location of the subject being photographed, streamline the handling of control points and local coordinates, or organize survey records so they are easier to use in downstream processes, LRTK, a smartphone-mounted high-precision GNSS positioning device, is effective. By organizing on-site positional information together with 3D reconstruction via SfM, it becomes easier to further improve the quality and reusability of cultural heritage survey records. If you want digital records of cultural heritage to remain assets that lead into subsequent surveys and conservation/use rather than one-off tasks, it is important to consider such on-site position management measures.


