
6 Basic Steps for Point Cloud Noise Removal, Downsampling, and Classification | How to Preserve Accuracy

By LRTK Team (Lefixea Inc.)


Table of Contents

Step 1: Coordinate integration of point cloud data and transformation to a reference coordinate system

Step 2: Removal of measurement noise and outliers

Step 3: Removal of unnecessary points (filtering to the area of interest)

Step 4: Downsampling of point cloud data (sampling)

Step 5: Separation of ground surface and structures (extraction by feature)

Step 6: Classification of point cloud data and assignment of attribute information

FAQ (Frequently Asked Questions)


Even when you acquire high-precision point cloud data, the raw result contains noise and unnecessary points, and the data volume is enormous. Handled carelessly, raw data can impair analysis and make processing slow and inefficient. That is why preprocessing such as noise removal, downsampling, and classification is important. The key is to make the data lighter without sacrificing accuracy: by removing unnecessary points and keeping only the ones you need, the accuracy and speed of downstream processes improve dramatically. In this article we walk through six basic steps, approachable even for beginners in point cloud processing, and explain methods and tips for noise removal, downsampling, and classification. Use it to make the most of your valuable point cloud data.


Step 1: Coordinate integration of point cloud data and transformation to a reference coordinate system

First, carry out the task of organizing the coordinates of the point cloud data. When multiple scans are taken with a laser scanner or when handling multiple point clouds obtained from drone surveys, you first need registration (alignment) to integrate them into a single coordinate space. By matching overlapping portions of point clouds acquired at each scan position or by using target markers placed on site, you align the point clouds with high precision. This process yields a consistent 3D dataset in which individual point clouds are reconciled and a wide area is covered.


It is also important to convert the acquired point cloud into a coordinate system that serves as a reference in real-world space. Using the survey site's known points (control points), georeference the point cloud to a map coordinate system such as the plane rectangular coordinate system or another public coordinate system. For example, you can assign field-surveyed coordinate values to feature points in the point cloud and transform it, or load a control point file and perform a bulk transformation in software. With this step done, the point cloud data sits on the same coordinate basis as design drawings and GIS data, so it can be overlaid with other survey results.
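As a concrete illustration of control-point-based transformation, the sketch below (a minimal NumPy example; all coordinates are hypothetical) fits a best-fit rigid transform from matched control point pairs using the Kabsch algorithm, which is the same basic idea behind the bulk transformation software performs from a control point file:

```python
import numpy as np

def fit_rigid_transform(src, dst):
    """Least-squares rigid transform (rotation R, translation t) mapping
    src points onto dst points (Kabsch algorithm). src/dst: (N, 3) arrays
    of matched control points in local and map coordinates."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)           # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                            # proper rotation (det = +1)
    t = c_dst - R @ c_src
    return R, t

# Hypothetical example: three control points measured in both systems
# (here the "map" system is a 90-degree rotation plus an offset).
local = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
mapped = local @ np.array([[0.0, -1.0, 0.0],
                           [1.0,  0.0, 0.0],
                           [0.0,  0.0, 1.0]]).T + np.array([100.0, 200.0, 5.0])
R, t = fit_rigid_transform(local, mapped)
residual = np.abs((local @ R.T + t) - mapped).max()
```

Checking the residual after the fit is exactly the accuracy check recommended above: a large residual means a bad point match or a mis-entered coordinate.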


Tips for maintaining accuracy: Use the most precise references available for alignment and check the residual errors in the merged result. When merging multiple point clouds, verify that discrepancies in overlapping areas stay within a few millimeters to a few centimeters (roughly 0.1 to 1 in). If you used known control points, confirm that their coordinates match correctly after merging. With coordinate integration and transformation done accurately, you can proceed confidently to the subsequent noise removal and analysis.


Step 2: Removing Measurement Noise and Outliers

Next comes removing noise and outliers from the point cloud. A freshly captured point cloud inevitably contains some unwanted points (noise) caused by measurement errors or disturbed sensor reflections. For example, noise may appear as points where the laser bounced off highly reflective materials such as glass or water and was recorded at the wrong position, points scattered outside the measurement range, or isolated points caused by device malfunctions. A single point located far from its surroundings is called an outlier and is very likely a point that does not actually exist.


Such noise points and outliers interfere with analysis if left as is, so they are removed with filter functions in dedicated software. A common method is outlier removal by statistical filtering: a point is treated as noise and deleted when it does not have at least a specified number of other points in its neighborhood. For example, you can set a criterion such as "remove points that have an extremely small number of neighbors within a radius of ○ m (○ ft)." This automatically detects and removes only the unnatural points isolated from their surroundings. Setting a distance threshold to filter out points that are clearly too far away (unexpected spikes) is also effective.
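The radius-based criterion described above can be sketched in a few lines of NumPy. This is a brute-force illustration with made-up parameters, not production code; real software uses spatial indexes such as KD-trees for large clouds:

```python
import numpy as np

def radius_outlier_filter(points, radius=0.5, min_neighbors=3):
    """Keep only points that have at least `min_neighbors` other points
    within `radius`. points: (N, 3) array. Brute force: O(N^2) memory."""
    diff = points[:, None, :] - points[None, :, :]        # (N, N, 3)
    dist = np.linalg.norm(diff, axis=2)                   # pairwise distances
    neighbors = (dist <= radius).sum(axis=1) - 1          # exclude the point itself
    return points[neighbors >= min_neighbors]

# Hypothetical example: a dense cluster plus two isolated spikes.
rng = np.random.default_rng(0)
cluster = rng.normal(0.0, 0.1, size=(50, 3))              # tight cluster near origin
outliers = np.array([[5.0, 5.0, 5.0], [-4.0, 3.0, 9.0]])  # far-away spikes
cloud = np.vstack([cluster, outliers])
clean = radius_outlier_filter(cloud, radius=0.5, min_neighbors=3)
# The two isolated spikes are removed; the cluster survives.
```

The `radius` and `min_neighbors` values here are arbitrary; as the article notes, overly strict parameters will also delete legitimate sparse detail, so tune them while comparing before/after views.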


Tips for this task: After applying a noise-removal filter, be sure to visually inspect the data. Automatic filters are convenient, but if the parameters are too strict they may remove points you actually need. Pay special attention to points along the edges of structures and in fine details, which can be mistakenly removed simply because they are sparse relative to their surroundings. To choose appropriate parameters, overlay the point clouds from before and after filtering, compare them, and check for unnatural gaps. Recently, software with AI-powered automatic noise detection has also appeared; because AI can identify and remove noisy points using trained models, even beginners without expert knowledge can now clean noise efficiently.


Step 3: Removal of Unnecessary Points (Narrowing the Target Area)

Once noise has been removed, the next step is to remove point cloud data that are unnecessary for analysis to streamline the dataset. Point cloud surveys often capture objects other than the survey target; while these are not noise, they can become unwanted objects depending on the purpose. For example, when creating a topographic map from road point cloud data, point clouds of moving vehicles and pedestrians are unnecessary for terrain analysis. Similarly, for point clouds that record a building’s exterior, point clouds of temporarily placed construction equipment or passersby will interfere with the analysis. Therefore, remove the points corresponding to these off-target elements and organize the data so that only the strictly necessary subjects remain.


Removal of unwanted points is often done by manual editing in dedicated software. Specifically, in a point cloud viewer you select the unwanted parts by region and delete them (or separate them into another file). If you want to remove a car on a road, select and delete the points corresponding to the car’s shape, leaving only the surrounding road surface points. When point cloud density is high and the target object is complex, it is efficient to roughly select the area with rectangles or polygons to erase unwanted items, then remove the finer details point by point. Also, cropping (extracting a range) to narrow down the necessary area is effective. For example, extracting only a specific construction section or the area around a building from a widely scanned point cloud dataset and temporarily excluding other areas makes subsequent processing significantly lighter. Keep the original point cloud file as a backup, and copy only the parts you will use for analysis before processing to be safe.
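Cropping to an area of interest while keeping a backup of the removed points, as recommended here, might look like this minimal NumPy sketch (the coordinates and box extents are hypothetical):

```python
import numpy as np

def crop_box(points, min_corner, max_corner):
    """Extract points inside an axis-aligned bounding box (cropping).
    Returns (inside, outside) so the discarded points can be kept
    as a backup instead of being permanently deleted."""
    mask = np.all((points >= min_corner) & (points <= max_corner), axis=1)
    return points[mask], points[~mask]

# Hypothetical example: keep only a 10 m x 10 m construction section.
cloud = np.array([[2.0, 3.0, 0.1],
                  [8.5, 9.9, 0.3],
                  [25.0, 4.0, 0.2]])    # last point lies outside the section
section, backup = crop_box(cloud, [0.0, 0.0, -5.0], [10.0, 10.0, 50.0])
```

Returning the outside points instead of dropping them mirrors the advice above: save what you remove to a separate file or layer so it can be restored later.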


Note: When removing unwanted points, carefully judge whether those points are truly unnecessary. Even point clouds that seem unrelated at first glance may contain information needed later. For example, after deleting all surrounding trees from a building point cloud you might later want to measure the distance to adjacent trees—but if the trees are no longer present in the data, you cannot perform that measurement. Therefore, it is important to decide based on a clear purpose which data to keep and which to discard. If you are unsure, instead of permanently deleting unwanted objects, you can use layering or save them in a separate file as a backup. That way you can restore them if needed and have peace of mind. At this stage, the point cloud data will be organized into the necessary and sufficient extent and content for the analysis target.


Step 4: Downsampling of Point Cloud Data (Sampling)

After removing unnecessary points, downsample (thin) the data to reduce the total number of points. High-resolution point cloud data can consist of millions to hundreds of millions of points; left as-is, the file becomes very large and displaying or analyzing it puts a heavy load on the computer. Since the purpose of preprocessing is "to make the data manageable while preserving accuracy," you need to reduce the total number of points while keeping the required amount of information.


The basic idea of downsampling is to reduce the spatial density of points uniformly. If you simply delete points at random, the density becomes uneven and the carefully captured shape can turn coarse or lose details. A representative method is grid-based sampling: overlay a three-dimensional grid of fixed size (cubic voxels) on the point cloud space, keep one representative point per grid cell, and delete the rest. For example, sampling "one point per 1 cm (0.4 in) grid" makes the point density roughly uniform at 1 cm (0.4 in) spacing in each direction. Such evenly spaced sampling is well suited to reducing point counts without significantly degrading the overall shape. Instead of uniform sampling, you can also vary the amount of thinning by object: keep important details (for example, small equipment parts) at high density while aggressively thinning wide, flat ground areas. Since this is more advanced editing, however, starting with uniform grid sampling is usually sufficient.
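Grid-based (voxel) sampling as described can be sketched as follows. This minimal NumPy version keeps the centroid of each occupied voxel as its representative point; the voxel size and data are purely illustrative:

```python
import numpy as np

def voxel_downsample(points, voxel_size=0.01):
    """Grid-based sampling: overlay a cubic voxel grid of `voxel_size`
    (e.g. 0.01 m = 1 cm) and keep one representative point per occupied
    voxel (here: the centroid of the points inside it)."""
    idx = np.floor(points / voxel_size).astype(np.int64)   # voxel index per point
    _, inverse, counts = np.unique(idx, axis=0, return_inverse=True,
                                   return_counts=True)
    sums = np.zeros((counts.size, 3))
    np.add.at(sums, inverse, points)                       # sum points per voxel
    return sums / counts[:, None]                          # centroid per voxel

# Hypothetical example: 1000 random points in a 10 cm cube, 1 cm voxels.
rng = np.random.default_rng(1)
cloud = rng.uniform(0.0, 0.1, size=(1000, 3))
thinned = voxel_downsample(cloud, voxel_size=0.01)
# At most 10 x 10 x 10 voxels can be occupied, so thinned never grows.
```

Keeping the voxel centroid (rather than an arbitrary member point) is a common choice because it smooths sensor jitter slightly while preserving the surface position.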


How to decide the thinning rate: The appropriate sampling interval depends on the required precision and the intended use. For example, measuring the dimensions of a structure to millimeter-level precision calls for intervals of 5 mm (0.2 in) or less, while capturing only the overall shape of terrain may be fine at several-centimeter (an inch or two) intervals. In general, aim for the minimum point density that still meets the required precision. The first time, proceed in stages: thin the point cloud to 50%, check that precision is acceptable, thin further to 25%, and so on, adjusting as you monitor the result. Compare file size and display responsiveness before and after each pass, and ideally stop at the smallest size you can work with comfortably.


Also, when point cloud data is unavoidably large and heavy, rather than forcibly downsampling the entire dataset, you can split the data by area (tiling). By managing each area as multiple files, you can keep the size of each file down. This makes it possible to improve usability without sacrificing important detail. Downsampling is only a means of optimization and not the act of "cutting data". Maintaining a sense of balance—reducing size while preserving important information—is the key to lightweighting data without degrading accuracy.


Step 5: Separation of the ground surface and structures (extraction by feature)

Once the point cloud has been thinned to an appropriate density, the next step is to separate the ground surface (terrain) from buildings and other objects. This process is sometimes called the primary classification of point cloud data. In fields such as civil surveying and terrain analysis, the shape of the ground is especially important, so the points belonging to the ground are extracted and managed separately from those of the buildings, trees, and other objects that cover it. This allows analyses based on the ground point cloud, such as drawing contour lines or calculating earthwork volumes, to proceed smoothly.


For separating the ground surface, it is common to use a ground extraction filter in dedicated software. Many point cloud processing packages include a function to automatically identify only the ground points. Such algorithms estimate the lowest-lying layer of points as the ground based on the vertical distribution of heights, point density, and continuity from lower to upper layers. Known approaches include progressive filtering, which treats the lowest point within a given radius as ground and grows the surface from it, and methods that extract terrain-like surfaces from changes in slope.


Simply applying such automatic processing yields a point cloud roughly classified into ground and non-ground (buildings, trees, vehicles, etc.).


In simpler cases, you can also separate by a height threshold, for example putting all points within 1 m (3.3 ft) of the ground into the ground class and everything higher into the structure class. However, because terrain undulates and features such as building foundations exist, a uniform judgment based on height alone easily leads to misclassification. The reliable procedure is to run automatic extraction with a dedicated filter and then visually inspect and correct the results: remove points that are clearly not ground (for example, a low vehicle resting on the ground that slipped into the ground class), or conversely add ground points that were missed.
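The "lowest point per area" idea can be illustrated with a deliberately simplified NumPy sketch. Real ground filters are far more robust than this, and all sizes and coordinates here are made up:

```python
import numpy as np

def split_ground(points, cell=1.0, tol=0.2):
    """Very simplified ground extraction: within each `cell` x `cell` grid
    cell in XY, take the lowest point as the local ground level and
    classify points within `tol` of it as ground. Real software uses more
    robust progressive filters; this only illustrates the idea."""
    idx = np.floor(points[:, :2] / cell).astype(np.int64)  # XY cell per point
    _, inverse = np.unique(idx, axis=0, return_inverse=True)
    z_min = np.full(inverse.max() + 1, np.inf)
    np.minimum.at(z_min, inverse, points[:, 2])            # lowest z per cell
    is_ground = points[:, 2] <= z_min[inverse] + tol
    return points[is_ground], points[~is_ground]

# Hypothetical example: flat ground at z = 0 plus a "building" at z = 3 m.
gx, gy = np.meshgrid(np.arange(0.25, 10.0, 0.5), np.arange(0.25, 10.0, 0.5))
ground_pts = np.column_stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)])
bx, by = np.meshgrid(np.arange(4.25, 6.0, 0.5), np.arange(4.25, 6.0, 0.5))
building = np.column_stack([bx.ravel(), by.ravel(), np.full(bx.size, 3.0)])
ground, nonground = split_ground(np.vstack([ground_pts, building]))
```

Note the failure mode this toy version shares with a pure height threshold: if a grid cell contains only building points, its lowest point is misread as ground, which is exactly why the article recommends visual inspection after automatic extraction.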


Benefits: Separating the ground from everything else streamlines downstream analysis. For example, you can immediately generate a gridded elevation model (DSM/DTM) from the ground point cloud, and by visualizing only the structure point cloud you can concentrate on understanding building shapes. Also, peripheral elements that were left during the unwanted-object removal step can be separated together with the ground surface so they can be handled separately as needed. The important thing is grouping the point cloud by real-world categories at this stage. Ground data in particular is valuable as foundational information, so extracting it accurately will prove useful later.


Step 6: Classification of Point Cloud Data and Adding Attribute Information

The final task is detailed classification (class labeling) of the point cloud data. Step 5 broadly separated ground from everything else; here we further subdivide that "everything else." Typical classes in the construction field include ground surface, buildings, civil structures, trees and vegetation, vehicles, and power lines. In the LAS data standard used in airborne LiDAR surveying, classification codes are defined for ground, buildings, low vegetation, high vegetation, water, and so on (for example, code 2 = ground, code 6 = building). Classifying into categories appropriate to the intended use gives the point cloud semantic meaning and makes analysis and practical use significantly easier.


Point cloud classification is basically handled by automated processing plus manual correction as needed. In recent years, some point cloud processing software has added automatic classification features that apply AI and image-recognition techniques; for example, tools that use pre-trained models to identify patterns in the point cloud and batch-classify points as "this is a tree, this is a building." By also using attribute information such as RGB values and reflectance intensity, automatic classification increasingly achieves high accuracy. It is not perfect, however, and misclassification can occur in complex environments, so always visually inspect the automatic results and correct obvious mistakes. Some software displays point clouds color-coded by class, letting you spot unnatural mixing at a glance (for example, part of a tree mixed into the building class). Correct those parts manually to finalize a clean classified dataset.


After classification is complete, point cloud data will have an "attribute code" assigned to each point. This adds a dimension of information to data that was merely a collection of three-dimensional points. Using classification codes, you can analyze the same point cloud from various angles—for example, "extract only ground points to perform terrain modeling", "display only building points in a different color", or "estimate the volume of tree points". Also, when integrating point clouds with CAD models or GIS data, having class information enables smooth filtering and symbolized display. From the perspective of preservation of accuracy, the classification process itself does not reduce the coordinate accuracy of the point cloud. However, incorrect classification carries the risk of misinterpreting analysis results (for example: something you thought was a building might actually be a tree). Therefore, it is reassuring to verify that classification results are reliable by referring to on-site observations and photographs.
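Once each point carries a class code, the analyses mentioned above reduce to simple attribute filtering. The sketch below uses the standard ASPRS/LAS codes (2 = ground, 3 = low vegetation, 5 = high vegetation, 6 = building, 9 = water) on a tiny hypothetical cloud:

```python
import numpy as np

# Standard ASPRS/LAS classification codes (a few common ones).
GROUND, LOW_VEG, HIGH_VEG, BUILDING, WATER = 2, 3, 5, 6, 9

# Hypothetical classified cloud: (N, 3) coordinates plus one code per point.
points = np.array([[0.0, 0.0, 0.1],
                   [1.0, 0.0, 0.2],
                   [1.0, 1.0, 8.5],
                   [2.0, 1.0, 2.3]])
classes = np.array([GROUND, GROUND, BUILDING, HIGH_VEG])

# "Extract only ground points to perform terrain modeling":
terrain = points[classes == GROUND]

# "Estimate the volume of tree points" starts from the same kind of mask:
trees = points[classes == HIGH_VEG]
```

In practice the codes would come from the file itself (for example, the classification field of a LAS file) rather than being assigned by hand as in this toy example.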


By following these six basic steps, from noise removal to classification, point cloud data is transformed into a lightweight, easy-to-use state that preserves accuracy. Point clouds with unnecessary data stripped away and classification information added will be ideal material for subsequent 3D analysis and model creation.


Make on-site measurement easier: So far we have discussed processing point cloud data, but there is also the underlying need to acquire high-precision point clouds easily in the first place. Traditionally this required expensive laser scanners and specialized expertise, but in recent years new solutions for simplified surveying have emerged. A representative example is the LRTK series. With the compact GNSS receiver "LRTK Phone," which attaches to a smartphone, anyone can easily begin 3D point cloud measurement. Because point clouds obtained with the smartphone's camera and sensors can be given centimeter-level (about half an inch) position information, a quick on-site scan yields point cloud data with survey-map accuracy. Distances, areas, and volumes can also be measured in real time in a dedicated app, so everything from acquisition to utilization can be completed on-site. Incorporating this kind of simplified surveying with LRTK can dramatically streamline the recording of small-scale sites and day-to-day progress management. As a new workflow that enables speedy measurement and processing while maintaining high accuracy, LRTK should become a strong ally for on-site digital transformation (DX).


FAQ (Frequently Asked Questions)

Q. Why is it necessary to remove noise from point cloud data? A. When measuring point clouds, points that do not actually exist (noise) inevitably become mixed in due to sensor errors and environmental factors. If this noise is left as-is, it can introduce errors into analysis results and needlessly lengthen processing time. By removing noise, the purity and reliability of the data are improved, and the accuracy of subsequent analyses is increased. Consider it a preparatory step to leave only the information you need.


Q. Won't thinning the point cloud reduce accuracy? A. If thinning is done appropriately, the impact on accuracy can be minimized. Thinning reduces the overall point density uniformly, so you can decrease data volume while preserving the shape's features as much as possible. However, thinning too much loses fine detail, so it is important to choose the point spacing according to the required accuracy. If you are concerned, keep the original point cloud data so you can readjust the density later and avoid any risk of accuracy degradation.


Q. Can classification of point cloud data be automated? A. Yes, with recent advances in AI technology, a certain level of automated classification has become possible. Some dedicated software can, with one click, have AI automatically classify points into ground, buildings, vegetation, and so on. However, automated classification results are not necessarily perfect, so a final quality check is required. After rough classification by automated processing, many systems provide a way to correct errors on-screen by checking color-coded displays even without specialist knowledge. By combining human judgment with automation, efficient and accurate classification can be achieved.


Q. How should a beginner start point cloud processing? A. First, get familiar with basic operations (rotating the viewpoint, sectional display, distance measurement, etc.) using a free point cloud viewer. Then try relatively simple noise removal and sampling to see how they make the data easier to handle. Point cloud processing software and cloud services with intuitive, easy-to-use UIs have multiplied recently, so even beginners can operate them. Measurement methods that are easy to start with, such as point cloud capture using a smartphone plus a simple device, have also emerged; with solutions like LRTK, even first-timers can acquire high-precision point clouds on-site and put them to use immediately. Point cloud processing may look difficult, but with the right tools the barrier is not high. Start by practicing on a small area and step up as you gain experience.


Next Steps:
Explore LRTK Products & Workflows

LRTK helps professionals capture absolute coordinates, create georeferenced point clouds, and streamline surveying and construction workflows. Explore the products below, or contact us for a demo, pricing, or implementation support.

LRTK supercharges field accuracy and efficiency

The LRTK series delivers high-precision GNSS positioning for construction, civil engineering, and surveying, enabling significant reductions in work time and major gains in productivity. It makes it easy to handle everything from design surveys and point-cloud scanning to AR, 3D construction, as-built management, and infrastructure inspection.
