7 Steps to Create a Gaussian Splatting Model: Beginner's Guide

By LRTK Team (Lefixea Inc.)

Gaussian splatting has rapidly gained prominence as a technique for producing 3D representations from photographs that closely resemble reality. Unlike the traditional approaches of displaying a point cloud or applying a mesh, it places many Gaussians in space and optimizes their appearance, which makes it easier to achieve a natural look, including the way light falls across surfaces and the feel of materials. The original paper and subsequent surveys emphasize that an explicit scene representation using 3D Gaussians combined with differentiable rendering enables both high-quality novel view synthesis and fast display.

On the other hand, many practitioners searching for practical guidance are likely more concerned with “where to start,” “how to conduct the shoot,” “where failures commonly occur,” and “how to use the finished product in the field” than with the theory itself. Therefore, this article breaks the overall workflow into seven steps—from preparation to publication—so that beginners won’t lose their way, and organizes the process from a practical perspective. The explanation focuses on ideas that are applicable across environments rather than depending on specific products or software names.

• Understand the big picture of Gaussian splatting

• Step 1 Decide goals and use cases first

• Step 2 Finalize the shooting plan

• Step 3 Capture images

• Step 4 Estimate camera poses and organize inputs

• Step 5 Train Gaussians and build the model

• Step 6 Check for artifacts and readjust

• Step 7 Prepare publishing and practical deployment

• Summary

Understand the big picture of Gaussian splatting

The first thing to grasp is that Gaussian splatting is not “just a filter that makes photos look 3D.” The basic flow is to estimate camera poses and a sparse point cloud from multiple photos or video frames, use that sparse information as a starting point to optimize a large number of 3D Gaussians, and render them while accounting for visibility. The publicly released original implementation follows the same pipeline: it takes pose-aligned results from multiple images as input and generates a 3D Gaussian model.
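
To make "render them while accounting for visibility" concrete, here is a minimal sketch of front-to-back alpha blending for a single pixel. It assumes each Gaussian has already been projected and reduced to a depth, a color, and an opacity at that pixel; the function name and the cutoff value are illustrative choices, not taken from any particular implementation.

```python
def composite_front_to_back(samples):
    """Blend (depth, color, alpha) samples for one pixel, nearest first.

    color is an (r, g, b) tuple in [0, 1]; alpha is the opacity one
    Gaussian contributes at this pixel after projection.
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still passing through
    for _depth, color, alpha in sorted(samples, key=lambda s: s[0]):
        weight = alpha * transmittance
        for channel in range(3):
            pixel[channel] += weight * color[channel]
        transmittance *= 1.0 - alpha
        if transmittance < 1e-4:  # assumed cutoff: farther samples are hidden
            break
    return tuple(pixel), transmittance


# A half-transparent red splat in front of an opaque blue one.
color, remaining = composite_front_to_back([
    (2.0, (0.0, 0.0, 1.0), 1.0),
    (1.0, (1.0, 0.0, 0.0), 0.5),
])
```

Because nearer splats reduce the weight of everything behind them, a surface that was never photographed has nothing to contribute, which is one reason coverage matters more than settings.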

This understanding is important because many failures stem from earlier stages, namely shooting and pose alignment, rather than from training itself. If you focus solely on the training step and expect visually pleasing results, you may miss earlier issues such as poor shooting paths, insufficient overlap, surfaces lacking features, or many moving objects. In other words, creating a Gaussian splatting model is not merely a matter of pressing the training button but a design process that includes how to collect inputs that are easy to reconstruct.

Step 1 Decide goals and use cases first

What beginners should do first is not necessarily increase the number of shots or immediately try the highest resolution. First clarify why you are making it. Whether you need a 3D view for internal sharing, want photorealism for client presentations, need to preserve spatial atmosphere for remote inspection, or plan to use it for dimensional or positional checks will change the optimal shooting method and evaluation criteria. While 3D Gaussian representations excel at visual fidelity and fast rendering, research has shown that when strict geometric accuracy is required, combining traditional alignment, point cloud, and reference coordinate management approaches can be effective.

It is especially important not to confuse “models for viewing” with “models for measuring.” For viewing, some geometric error is acceptable if the overall appearance holds together and viewpoint transitions feel smooth. Conversely, if you aim for measurement or positioning use cases, you must plan during shooting how to place references, provide scale, connect to coordinate systems, and overlay other data, or you will face difficulties later. Beginners are less likely to fail if they focus their first attempt on “producing an attractive model suitable for sharing and review,” then expand to positioning and measurement use cases afterward.

Step 2 Finalize the shooting plan

Once the goal is decided, the next step is the shooting plan. Going to the site with an ambiguous plan often leads to typical failures such as insufficient image count, missing faces you wanted to show, or a noisy background that buries the subject. Image-based 3D reconstruction is sensitive to capture conditions: reflective surfaces, transparent surfaces, and featureless plain surfaces tend to make feature matching unstable, causing noise, holes, and distortion. Moreover, insufficient overlap between captures increases reconstruction holes, so beginners should design capture paths so that adjacent images overlap generously. Research shows that, depending on the use case, a higher-than-usual overlap rate can be effective. (Source: isprs-archives.copernicus.org)
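
As a rough worked example of what "generous overlap" means in numbers, the sketch below estimates how far to move between shots when walking parallel to a flat surface such as a wall. It assumes a simple pinhole camera pointed straight at the surface; the function names and the sample values are hypothetical, and real sites will need adjustment.

```python
import math


def pass_spacing(distance_m, horizontal_fov_deg, overlap):
    """Distance to move between shots for a given overlap ratio (0 to <1)."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Width of surface covered by one image at this distance.
    footprint = 2.0 * distance_m * math.tan(math.radians(horizontal_fov_deg) / 2.0)
    return footprint * (1.0 - overlap)


def shots_for_pass(length_m, distance_m, horizontal_fov_deg, overlap):
    """Number of shots needed to cover a straight pass of length_m."""
    spacing = pass_spacing(distance_m, horizontal_fov_deg, overlap)
    return math.ceil(length_m / spacing) + 1


# Example: 19 m wall, 5 m away, 90 degree field of view, 80% overlap.
spacing = pass_spacing(5.0, 90.0, 0.8)
count = shots_for_pass(19.0, 5.0, 90.0, 0.8)
```

Raising the overlap from 0.8 to 0.9 halves the spacing and roughly doubles the shot count, so it is worth deciding the overlap before arriving on site.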

In practice, it’s effective to plan passes at different heights from the start rather than only circling the subject at a single height. Shooting only near the frontal direction will leave top and recessed areas uncovered, and without information from below the model can lack a sense of volume. Mixing close-up detail shots with wider shots that show the overall relationship helps capture both visual detail and global consistency. Simply writing down on paper “what you want to show” and “what is likely to be hidden” before shooting will significantly reduce hesitation on site.
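
The multi-height idea can be written down as a simple capture plan before going on site. The sketch below lays out camera positions on rings around the subject at several heights. The radius, heights, and angular step are hypothetical inputs you would choose for your subject, not recommended values.

```python
import math


def ring_positions(radius_m, height_m, step_deg):
    """Evenly spaced (x, y, z) camera positions on one ring around the origin."""
    count = math.ceil(360.0 / step_deg)
    positions = []
    for i in range(count):
        angle = 2.0 * math.pi * i / count
        positions.append((radius_m * math.cos(angle),
                          radius_m * math.sin(angle),
                          height_m))
    return positions


def capture_plan(radius_m, heights_m, step_deg):
    """One ring of positions per height, keyed by height."""
    return {h: ring_positions(radius_m, h, step_deg) for h in heights_m}


# Three passes (low, eye level, high) with a 15 degree step between shots.
plan = capture_plan(3.0, [0.5, 1.5, 2.5], 15.0)
total_shots = sum(len(ring) for ring in plan.values())
```

Even if you never follow the positions exactly, the total gives a realistic image count to budget time and storage for.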

Step 3 Capture images

At the shoot, the top priority is to consistently collect images that make pose estimation easy later. Gaussian splatting can start from multiple images or video, but for beginners aiming to succeed on the first project, it’s easier to start mainly with still images. Although the original paper assumes novel view synthesis from multiple photos or video, publicly released implementations ultimately take a pose-aligned set of images and a sparse reconstruction as input, so poor original image quality is hard to recover later. Particular pitfalls are camera shake, subject motion blur, extreme exposure differences, and inconsistent focus, all of which do more damage than they might appear to.

When shooting, it is more important to collect images continuously according to a consistent rule than to aim for perfection in each frame. Too much movement between frames weakens connections between images, while too many similar compositions reduce depth variation. Strictly following basics—don’t change distance abruptly, don’t jump orientations, smoothly orbit around the subject—will greatly improve the stability of later steps. If there are moving people, vehicles, or swaying vegetation, choose a static moment if possible; for unavoidable moving elements, consider capturing those areas separately or shoot with extra overlap so they can be excluded later if needed.

Another easy-to-overlook point for beginners is background handling. If you tightly crop only the subject, you may lack the surrounding information needed for alignment. Conversely, too much background makes the main subject ambiguous. The ideal is to keep the subject as the main focus of the frame while including surrounding patterns or shapes that provide alignment cues. Make a habit of reviewing a few images on site right after capture to check for blur, blown highlights, crushed shadows, and insufficient overlap—this reduces the risk of needing to revisit the site.
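
The on-site review for blur, blown highlights, and crushed shadows can be partly automated. The sketch below works on a grayscale image given as a list of rows with values from 0 to 255; loading the image is left to whatever library you use. The variance-of-Laplacian score is a common sharpness heuristic rather than anything specific to Gaussian splatting, and the clipping thresholds are assumed values.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian; low values suggest blur."""
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    return pvariance(responses)


def clipped_fraction(gray, low=5, high=250):
    """Fractions of pixels that are nearly black and nearly white."""
    pixels = [p for row in gray for p in row]
    dark = sum(p <= low for p in pixels) / len(pixels)
    bright = sum(p >= high for p in pixels) / len(pixels)
    return dark, bright


# A flat grey patch has no edges; a checkerboard has many.
flat = [[128] * 5 for _ in range(5)]
checker = [[255 * ((x + y) % 2) for x in range(5)] for y in range(5)]
flat_score = laplacian_variance(flat)
checker_score = laplacian_variance(checker)
```

Scores are only comparable between images of the same scene at the same resolution, so compare each frame against its neighbours rather than against a fixed number.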

Step 4 Estimate camera poses and organize inputs

After collecting images, the next step is camera pose estimation and input organization. Here you estimate where each photo was taken from and reconstruct the spatial relationships among the images. The publicly released original implementation assumes an image directory and a data structure containing sparse camera information, and shows conversion steps to prepare undistorted images and alignment information. In other words, Gaussian splatting relies on preparing inputs in a pose-alignment-friendly format rather than simply throwing in raw images and expecting everything to be automatically fixed.
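
A quick check of the input folder before training can save a failed run. The sketch below assumes the layout commonly produced by a COLMAP-style conversion for the original implementation: an images directory plus a sparse/0 directory holding binary camera, image, and point files. Treat the exact file names as an assumption and adjust them to whatever your tool actually writes.

```python
from pathlib import Path

# Assumed file names for a COLMAP-style sparse model; verify for your tool.
EXPECTED_SPARSE_FILES = ("cameras.bin", "images.bin", "points3D.bin")


def check_dataset(root):
    """Return a list of problems found in the dataset folder (empty if none)."""
    root = Path(root)
    problems = []
    images_dir = root / "images"
    if not images_dir.is_dir() or not any(images_dir.iterdir()):
        problems.append("images/ is missing or empty")
    sparse_dir = root / "sparse" / "0"
    for name in EXPECTED_SPARSE_FILES:
        if not (sparse_dir / name).is_file():
            problems.append(f"sparse/0/{name} is missing")
    return problems


if __name__ == "__main__":
    for problem in check_dataset("."):
        print(problem)
```

An empty result only means the files exist. It says nothing about whether the poses inside them are good, which still needs a visual check of the sparse reconstruction.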

Many beginners stumble here because they do not select and organize their images carefully enough. Even when the photos look similar at a glance, a set that mixes in heavily blurred frames, badly exposed frames, frames that show only half the subject, or repeated passes that scramble the capture order can destabilize alignment. Using all images does not necessarily improve accuracy. Instead, keeping only well-connected images and removing clearly bad ones upfront tends to produce a cleaner sparse reconstruction. At this stage, prioritize a coherent dataset over including every single image.

Additionally, organizing scale and reference strategies here makes later steps easier. For purely visual purposes, relative scale may suffice in some cases, but if you plan to use the model for field explanations or to overlay other data, decide in advance when and to what reference you will align positions. In fields like architecture, civil engineering, and facilities management where positional information has value, plan the operation so visualization models and positioning references are connectable from the start to reduce rework.

Step 5 Train Gaussians and build the model

Once you have pose-aligned inputs, it’s time to train the Gaussians. Starting from a sparse point cloud, you optimize a large number of 3D Gaussians. The original paper uses sparse points obtained from camera calibration as initial values, optimizes anisotropic Gaussians with density control, and relies on a fast visibility-aware renderer that speeds up both training and display. This explains why Gaussian splatting makes it relatively easy to achieve both a sense of visual density and snappy rendering.
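
To show what "anisotropic" means here, the sketch below builds the covariance of one Gaussian from a per-axis scale and a rotation, using the factorization Σ = R S Sᵀ Rᵀ described in the original paper. The helper names are hypothetical and the code is plain Python for readability, not how a real trainer stores or updates these values.

```python
import math


def quaternion_to_rotation(q):
    """3x3 rotation matrix from a quaternion (w, x, y, z), normalized first."""
    w, x, y, z = q
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance(scale, q):
    """Covariance of a Gaussian with per-axis scale and rotation q."""
    r = quaternion_to_rotation(q)
    # M = R S with S = diag(scale), so the covariance is M M^T.
    m = [[r[i][j] * scale[j] for j in range(3)] for i in range(3)]
    return [[sum(m[i][k] * m[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]


# A Gaussian stretched along its local axes, then turned 90 degrees about z.
half = math.radians(90.0) / 2.0
sigma = covariance((1.0, 2.0, 3.0), (math.cos(half), 0.0, 0.0, math.sin(half)))
```

Building the covariance this way keeps it symmetric and positive semi-definite for any scale and rotation, so optimization cannot produce an invalid shape.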

However, beginners should not aim for the highest quality from the start. Public implementations assume a GPU-based environment and relatively large memory for high-quality training, and training to the quality level used in published evaluations can be resource-intensive. The same materials also show that reducing the number of iterations or relaxing density control conditions can lower memory requirements, and that you can test with reduced input resolution. In short, run a lighter configuration through to the end once, identify where failures occur, and then revisit shooting or settings. This saves time and compute resources.
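
One way to keep a "lighter first run" repeatable is to build the training command from a small function. The sketch below only assembles an argument list and does not run anything. The script name and flag names follow the original public implementation's training script as best I recall, so treat them as assumptions and confirm them against the help output of the tool you actually use. The numeric defaults are illustrative, not recommendations.

```python
def light_training_command(source_path, model_path,
                           iterations=7000,
                           resolution_divisor=4,
                           densify_until_iter=3000):
    """Argument list for a reduced-cost trial run (flag names are assumed)."""
    return [
        "python", "train.py",
        "-s", str(source_path),            # folder with images and sparse model
        "-m", str(model_path),             # where the trained model is written
        "--iterations", str(iterations),   # fewer optimization steps
        "--resolution", str(resolution_divisor),  # train on downscaled images
        "--densify_until_iter", str(densify_until_iter),  # stop adding Gaussians early
    ]


command = light_training_command("data/site_a", "output/site_a_trial")
print(" ".join(command))
```

Keeping the trial settings in one place makes it easy to rerun the same light configuration after a re-shoot and compare results fairly.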

Keep in mind that training is not magic and can amplify problems in the inputs. Surfaces not visible in the images cannot be reconstructed, and areas with weak overlap appear as blurring or stretching. Therefore, treat the goal of the first training run less as “producing a finished product” and more as “visualizing where data is lacking.” From that perspective, the first result is not a failure but a diagnostic that informs a more precise second shoot.

Step 6 Check for artifacts and readjust

After training, the step that deserves the most time is verification. Beginners tend to be satisfied once something is generated, but whether it is usable in practice is decided here. In verification, prioritize checking for failures when moving the viewpoint rather than whether a single still image looks pretty. Common issues include softened contours, unnatural depth relationships, stretching into thin plank-like forms, blending of background and subject boundaries, and failures on reflective or transparent surfaces. Image-based reconstruction is strongly affected by capture conditions, and reflective, transparent, and feature-poor surfaces are especially prone to noise and geometric distortion. (Source: isprs-archives.copernicus.org)

Importantly, before blaming settings, translate problems into “what was missing in the inputs.” If the top surface collapses, you may have lacked shots from above; if corners are ambiguous, you may have missed oblique cuts; if a specific face is unstable, its glossiness or lack of texture may be the cause. In verification, it’s crucial to be able to convert observed failures into concrete re-shooting instructions. Once you can do that, the success rate for subsequent attempts rises dramatically.
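
Writing the symptom-to-cause mapping down as a lookup table makes it easier to hand re-shooting instructions to someone else. The sketch below encodes only the three examples from the paragraph above. The wording of the keys and the fallback message are hypothetical, and you would extend the table with your own findings.

```python
RESHOOT_HINTS = {
    "top surface collapses": "add shots from above the subject",
    "corners are ambiguous": "add oblique shots across the corners",
    "specific face is unstable":
        "inspect that face for gloss or lack of texture before re-shooting",
}


def reshoot_instructions(observed_failures):
    """Turn a list of observed failures into one instruction line each."""
    instructions = []
    for failure in observed_failures:
        hint = RESHOOT_HINTS.get(
            failure, "no rule yet: review the input images for this area")
        instructions.append(f"{failure}: {hint}")
    return instructions


for line in reshoot_instructions(["top surface collapses",
                                  "corners are ambiguous"]):
    print(line)
```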

If the use case goes beyond visualization for sharing, also verify against other criteria. For example, check whether the model aligns acceptably with existing point clouds, drawings, reference points, or known dimensions. Recent research suggests integrating Gaussian splatting for high-quality visualization while supporting geometric accuracy in a separate framework. Since a model that only looks good tends to stall at the field deployment stage, always define verification axes appropriate to the intended use. (Source: GDMC)
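
Checking against known dimensions can be as simple as the sketch below: fit one scale factor from reference lengths measured both in the model and on site, then look at the remaining error on separate check lengths. The function names, the sample lengths, and the tolerance are all hypothetical, and a single scale factor does not correct rotation, offset, or local distortion.

```python
def scale_factor(model_lengths, real_lengths):
    """Least-squares scale that maps model units to real-world units."""
    numerator = sum(m * r for m, r in zip(model_lengths, real_lengths))
    denominator = sum(m * m for m in model_lengths)
    return numerator / denominator


def dimension_errors(model_lengths, real_lengths, scale):
    """Signed error (scaled model length minus real length) per check length."""
    return [scale * m - r for m, r in zip(model_lengths, real_lengths)]


def within_tolerance(errors, tolerance):
    """True if every absolute error is at or below the tolerance."""
    return all(abs(e) <= tolerance for e in errors)


# Reference lengths fix the scale; separate check lengths test it.
scale = scale_factor([1.0, 2.0], [2.0, 4.0])
errors = dimension_errors([1.5], [3.02], scale)
acceptable = within_tolerance(errors, tolerance=0.05)
```

Using different lengths for fitting and for checking matters: errors measured on the same lengths that set the scale will look better than the model really is.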

Step 7 Prepare publishing and practical deployment

The final step is to prepare how the finished model will be used. Gaussian splatting provides more value when you design who will view it, on which devices, and in which situations, rather than just making it and stopping. Survey papers also emphasize that due to its fast rendering and editability, this technique easily extends to immersive experiences, interactive browsing, and various media applications. In other words, the deliverable should be designed as “an experience that accelerates site understanding” rather than merely as a 3D model, which makes internal adoption and explanations easier.

Practitioners should think about role allocation for 3D representations. For instance, in meetings or remote inspections, realistic appearance that conveys spatial context is valuable. For construction management, as-built verification, layout positioning, and coordinate checks, appearance alone is insufficient. Those use cases require operational flows that connect point clouds, reference coordinates, positioning data, and drawings. Therefore, do not treat Gaussian splatting as an end product in isolation; design operations so it can interoperate with other practical data.

This approach is especially effective in domains where site coordinates matter, such as construction, civil engineering, and facilities management. For example, use Gaussian splatting to quickly convey spatial atmosphere and shape for sharing, and use LRTK for on-site checks and simple surveying—this role division naturally connects the 3D visualization for viewing with the positional information used in the field. In situations where you want to confirm real-world positions while viewing a 3D model, combining visualization with a high-precision GNSS positioning device like the iPhone-mounted LRTK helps treat visualization and positioning not as separate worlds but as a single practical workflow.

Summary

For beginners, the way to create Gaussian splatting can be organized around seven steps, and the key to success lies less in fine differences in training settings than in carefully iterating four areas: goal setting, shooting planning, input organization, and verification. Technically, its main appeal is optimizing many 3D Gaussians from sparse alignment results to enable fast, high-quality display. That is why for a first project it is more valuable to understand which inputs are stable and which conditions cause failures than to aim for a perfect final piece.

What really matters in practice is not just producing a pretty 3D representation but how you connect that output to sharing, verification, explanation, measurement, and on-site response. If you want to convey sites and objects intuitively with Gaussian splatting while linking through to positional checks and simple surveys, combining it with a positioning foundation like LRTK makes it easier to balance visual clarity and field-operational certainty. The idea of linking “3D for showing” with “coordinates for using” will become increasingly important for practical deployment going forward.