51 Commits

Author SHA1 Message Date
1862dce077 upd 2024-10-29 17:09:36 +00:00
420e9c97bd update 2024-10-29 16:59:03 +00:00
b3a7650d3e local_only: debug 2024-10-29 16:54:42 +00:00
8d7299b482 local_only: dataset 2024-10-29 12:40:06 +00:00
234c8bccc3 local_only: pipeline 2024-10-29 12:39:06 +00:00
b30e9d535a global_and_local: config 2024-10-29 12:34:37 +00:00
d8c95b6f0c global_and_local: pipeline 2024-10-29 12:32:42 +00:00
ab31ba46a9 global_and_local: config 2024-10-29 12:29:04 +00:00
f533104e4a global_only: pipeline 2024-10-29 12:04:54 +00:00
a21538c90a global_only: dataset 2024-10-29 11:41:44 +00:00
872405e239 remove fps 2024-10-29 11:23:28 +00:00
b13e45bafc solve merge 2024-10-29 08:14:43 +00:00
63a246c0c8 debug new training 2024-10-28 19:15:48 +00:00
9e39c6c6c9 solve merge 2024-10-28 18:27:16 +00:00
3c9e2c8d12 solve merge 2024-10-28 18:25:53 +00:00
a883a31968 solve merge 2024-10-28 17:03:03 +00:00
49bcf203a8 update 2024-10-28 16:48:34 +00:00
hofee 1c443e533d add inference_server 2024-10-27 04:17:08 -05:00
hofee 3b9c966fd9 Merge branch 'master' of https://git.hofee.top/hofee/nbv_reconstruction 2024-10-26 03:24:18 -05:00
hofee a41571e79c update 2024-10-26 03:24:01 -05:00
bd27226f0f solve merge 2024-10-25 14:40:26 +00:00
5c56dae24f upd 2024-10-24 20:19:23 +08:00
ebb1ab3c61 udp 2024-10-24 20:18:47 +08:00
hofee a1226eb294 update normal in computing strategy 2024-10-23 11:13:18 -05:00
hofee 9d0119549e Merge branch 'master' of https://git.hofee.top/hofee/nbv_reconstruction 2024-10-23 02:59:18 -05:00
hofee 64891ef189 update normal strategy 2024-10-23 02:58:58 -05:00
75c70a9e59 fix no normal case 2024-10-23 14:54:53 +08:00
hofee 7e68259f6d update clean preprocess 2024-10-23 01:03:40 -05:00
64b22fd0f4 solve merge 2024-10-23 13:59:12 +08:00
b18c1591b7 load 16bit float 2024-10-23 13:57:45 +08:00
hofee c55a398b6d update nrm 2024-10-23 00:47:28 -05:00
hofee e25f7b3334 add save preprocessed normals 2024-10-23 00:42:18 -05:00
hofee cd56d9ea58 update readme 2024-10-22 16:42:10 +08:00
hofee d58c7980ed update 2024-10-22 16:41:02 +08:00
hofee 41eddda8d4 solve merge 2024-10-22 16:01:56 +08:00
hofee ccec9b8e8a add readme.md 2024-10-22 16:01:11 +08:00
0f61e1d64d Merge branch 'master' of https://git.hofee.top/hofee/nbv_reconstruction 2024-10-21 07:33:40 +00:00
9ca0851bf7 debug pipeline 2024-10-21 07:33:32 +00:00
be67be95e9 solve merge 2024-10-19 19:08:39 +08:00
c9d05f0c86 merge 2024-10-19 19:07:40 +08:00
hofee ed569254dc Merge branch 'master' of https://git.hofee.top/hofee/nbv_reconstruction 2024-10-19 19:06:26 +08:00
hofee be7ec1a433 update 2024-10-19 19:06:09 +08:00
d0fbb0f198 remove o3d voxel_downsample 2024-10-17 14:28:19 +00:00
5dae3c53db remove mesh from strategy generator 2024-10-17 11:23:08 +00:00
15d1903080 Merge branch 'master' of https://git.hofee.top/hofee/nbv_reconstruction 2024-10-17 11:15:04 +00:00
hofee b3344626cf solve merge 2024-10-17 06:14:46 -05:00
hofee 0267aed6e5 add normal and visualize util 2024-10-17 06:13:18 -05:00
22e7a1aed4 Merge branch 'master' of https://git.hofee.top/hofee/nbv_reconstruction 2024-10-17 11:11:14 +00:00
8892b6ed05 sync 2024-10-17 11:07:29 +00:00
hofee 8d92676c34 Merge branch 'master' of https://git.hofee.top/hofee/nbv_reconstruction 2024-10-10 10:16:03 -05:00
hofee 1e4fd13a24 update yaml 2024-10-10 10:15:55 -05:00
24 changed files with 994 additions and 426 deletions

3
.gitignore vendored
View File

@@ -11,4 +11,5 @@ test/
*.log *.log
/data_generation/data/* /data_generation/data/*
/data_generation/output/* /data_generation/output/*
test/ test/
temp*

192
Readme.md Normal file
View File

@@ -0,0 +1,192 @@
# Next Best View for Reconstruction
## 1. Setup Environment
### 1.1 Install Main Project
```bash
mkdir nbv_rec
cd nbv_rec
git clone https://git.hofee.top/hofee/nbv_reconstruction.git
```
### 1.2 Install PytorchBoot
The environment is based on PytorchBoot. Clone and install it from [PytorchBoot](https://git.hofee.top/hofee/PyTorchBoot.git):
```bash
git clone https://git.hofee.top/hofee/PyTorchBoot.git
cd PyTorchBoot
pip install .
cd ..
```
### 1.3 Install Blender (Optional)
If you want to render your own dataset as described in [section 2. Render Datasets](#2-render-datasets), you'll need to install Blender version 4.0 from [Blender Release](https://download.blender.org/release/Blender4.0/). Here is an example of installing Blender on Ubuntu:
```bash
wget https://download.blender.org/release/Blender4.0/blender-4.0.2-linux-x64.tar.xz
tar -xvf blender-4.0.2-linux-x64.tar.xz
```
If Blender is not in your PATH, you can add it with:
```bash
export PATH=$PATH:/path/to/blender/blender-4.0.2-linux-x64
```
To run the Blender script, you need to install the `pyyaml` and `scipy` packages into Blender's bundled Python environment. Run the following command to print the Python path of your Blender installation:
```bash
./blender -b --python-expr "import sys; print(sys.executable)"
```
Then copy the Python path `/path/to/blender_python` shown in the output and run the following command to install the packages:
```bash
/path/to/blender_python -m pip install pyyaml scipy
```
### 1.4 Install Blender Render Script (Optional)
Clone the script from [nbv_rec_blender_render](https://git.hofee.top/hofee/nbv_rec_blender_render.git) and rename it to `blender`:
```bash
git clone https://git.hofee.top/hofee/nbv_rec_blender_render.git
mv nbv_rec_blender_render blender
```
### 1.5 Check Dependencies
Switch to the project root directory and run `pytorch-boot scan` or `ptb scan` to check if all dependencies are installed:
```bash
cd nbv_reconstruction
pytorch-boot scan
# or
ptb scan
```
If you see project structure information in the output, it means all dependencies are correctly installed. Otherwise, you may need to run `pip install xxx` to install the missing packages.
## 2. Render Datasets (Optional)
### 2.1 Download Object Mesh Models
Download the mesh models divided into three parts from:
- [object_meshes_part1.zip](None)
- [object_meshes_part2.zip](https://pan.baidu.com/s/1pBPhrFtBwEGp1g4vwsLIxA?pwd=1234)
- [object_meshes_part3.zip](https://pan.baidu.com/s/1peE8HqFFL0qNFhM5OC69gA?pwd=1234)
Alternatively, download the whole dataset from [object_meshes.zip](https://pan.baidu.com/s/1ilWWgzg_l7_pPBv64eSgzA?pwd=1234).
Download the table model from [table.obj](https://pan.baidu.com/s/1sjjiID25Es_kmcdUIjU_Dw?pwd=1234).
### 2.2 Set Render Configurations
Open the file `configs/local/view_generate_config.yaml` and modify the parameters to fit your needs. At a minimum, set the following parameters in `runner-generate` (a configuration sketch follows the list):
- `object_dir`: the directory of the downloaded object mesh models
- `output_dir`: the directory to save the rendered dataset
- `table_model_path`: the path of the downloaded table model
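For reference, here is a minimal sketch of these settings (the nesting follows the `runner`/`generate` layout used by the repository's configs; the paths are placeholders to replace with your own):
```yaml
runner:
  generate:
    object_dir: /path/to/object_meshes      # directory of the downloaded object mesh models
    output_dir: /path/to/rendered_dataset   # directory to save the rendered dataset
    table_model_path: /path/to/table.obj    # path of the downloaded table model
```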
### 2.3 Render Dataset
There are two ways to render the dataset:
#### 2.3.1 Render with Visual Monitoring
If you want to visually monitor the rendering progress and machine resource usage:
1. In the terminal, run:
```
ptb ui
```
2. Open your browser and visit http://localhost:5000
3. Navigate to `Project Dashboard - Project Structure - Applications - generate_view`
4. Click the `Run` button to execute the rendering script
#### 2.3.2 Render in Terminal
If you don't need visual monitoring and prefer to run the rendering process directly in the terminal, simply run:
```
ptb run generate_view
```
This command will start the rendering process without launching the UI.
## 3. Preprocess
⚠️ The preprocessing code is currently not managed by `PytorchBoot`. To run the preprocessing:
1. Open the `./preprocess/preprocessor.py` file.
2. Locate the `if __name__ == "__main__":` block at the bottom of the file.
3. Specify the dataset folder by setting `root = "path/to/your/dataset"` (see the sketch below).
4. Run the preprocessing script directly:
```
python ./preprocess/preprocessor.py
```
This will preprocess the data in the specified dataset folder.
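For illustration, after step 3 the bottom of `preprocessor.py` should look roughly like this simplified sketch (the dataset path is a placeholder; the loop mirrors the script's existing `__main__` block):
```python
if __name__ == "__main__":
    root = "path/to/your/dataset"      # step 3: point this at your dataset folder
    scene_list = os.listdir(root)      # one sub-directory per scene
    from_idx = 0
    to_idx = len(scene_list)           # narrow this range to preprocess only part of the data
    cnt = 0
    total = to_idx - from_idx
    for scene in scene_list[from_idx:to_idx]:
        save_scene_data(root, scene, cnt, total, file_type="npy")
        cnt += 1
```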
## 4. Generate Strategy Label
### 4.1 Set Configuration
Open the file `configs/local/strategy_generate_config.yaml` and modify the parameters to fit your needs. At a minimum, set the following parameter (a configuration sketch follows):
- `datasets.OmniObject3d.root_dir`: the directory of your dataset
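A sketch of the corresponding section (the root path is a placeholder; `from`/`to` select a scene index range, and `-1` for `to` means up to the last scene):
```yaml
datasets:
  OmniObject3d:
    root_dir: /path/to/your/dataset   # preprocessed dataset from the previous step
    from: 0
    to: -1                            # -1 means process scenes up to the end
```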
### 4.2 Generate Strategy Label
There are two ways to generate the strategy label:
#### 4.2.1 Generate with Visual Monitoring
If you want to visually monitor the generation progress and machine resource usage:
1. In the terminal, run:
```
ptb ui
```
2. Open your browser and visit http://localhost:5000
3. Navigate to Project Dashboard - Project Structure - Applications - generate_strategy
4. Click the `Run` button to execute the generation script
#### 4.2.2 Generate in Terminal
If you don't need visual monitoring and prefer to run the generation process directly in the terminal, simply run:
```
ptb run generate_strategy
```
This command will start the strategy label generation process without launching the UI.
## 5. Train
### 5.1 Set Configuration
Open the file `configs/local/train_config.yaml` and modify the parameters to fit your needs. At a minimum, set the following parameters in the `experiment` section:
```yaml
experiment:
  name: your_experiment_name
  root_dir: path/to/your/experiment_dir
  use_checkpoint: False # if True, the checkpoint will be loaded
  epoch: 600 # specific epoch to load, -1 stands for last epoch
  max_epochs: 5000 # maximum epochs to train
  save_checkpoint_interval: 1 # save checkpoint interval
  test_first: True # if True, test process will be performed before training at each epoch
```
Adjust these parameters according to your training requirements.
### 5.2 Start Training
There are two ways to start the training process:
#### 5.2.1 Train with Visual Monitoring
If you want to visually monitor the training progress and machine resource usage:
1. In the terminal, run:
```
ptb ui
```
2. Open your browser and visit http://localhost:5000
3. Navigate to Project Dashboard - Project Structure - Applications - train
4. Click the `Run` button to start the training process
#### 5.2.2 Train in Terminal
If you don't need visual monitoring and prefer to run the training process directly in the terminal, simply run:
```
ptb run train
```
This command will start the training process without launching the UI.
## 6. Evaluation
...

22
TODO.md
View File

@@ -1,22 +0,0 @@
# TODO
## Preprocess data
### 1. View generation stage
**input**: object mesh
### 2. Label generation stage
**input**: target object point cloud, target object point-cloud normals, table scan points, captured table scan points
**removable data**: mask, normal
### 3. Training stage
**input**: full point cloud, pose, label
**removable data**: depth
### After view generation
Preprocess the target object point cloud, target object point-cloud normals, table scan points, captured table scan points, and the full point cloud
Delete depth, mask, normal
### After label generation
Upload only the full point cloud, pose, and label

View File

@@ -5,5 +5,5 @@ from runners.data_spliter import DataSpliter
class DataSplitApp: class DataSplitApp:
@staticmethod @staticmethod
def start(): def start():
DataSpliter("configs/server/split_dataset_config.yaml").run() DataSpliter("configs/server/server_split_dataset_config.yaml").run()

View File

@@ -12,26 +12,16 @@ runner:
generate: generate:
voxel_threshold: 0.003 voxel_threshold: 0.003
soft_overlap_threshold: 0.3 overlap_area_threshold: 30
hard_overlap_threshold: 0.6 compute_with_normal: False
filter_degree: 75 scan_points_threshold: 10
to_specified_dir: True # if True, output_dir is used, otherwise, root_dir is used
save_points: True
load_points: True
save_best_combined_points: False
save_mesh: True
overwrite: False overwrite: False
seq_num: 15 seq_num: 10
dataset_list: dataset_list:
- OmniObject3d - OmniObject3d
datasets: datasets:
OmniObject3d: OmniObject3d:
#"/media/hofee/data/data/temp_output" root_dir: /data/hofee/nbv_rec_part2_preprocessed
root_dir: /media/hofee/repository/full_data_output from: 155
model_dir: /media/hofee/data/data/scaled_object_meshes to: 165 # ..-1 means end
from: 0
to: -1 # -1 means end
#output_dir: "/media/hofee/data/data/label_output"

View File

@@ -84,7 +84,7 @@ module:
gf_view_finder: gf_view_finder:
t_feat_dim: 128 t_feat_dim: 128
pose_feat_dim: 256 pose_feat_dim: 256
main_feat_dim: 2048 main_feat_dim: 3072
regression_head: Rx_Ry_and_T regression_head: Rx_Ry_and_T
pose_mode: rot_matrix pose_mode: rot_matrix
per_point_feature: False per_point_feature: False

View File

@@ -7,17 +7,17 @@ runner:
name: debug name: debug
root_dir: experiments root_dir: experiments
generate: generate:
port: 5004 port: 5002
from: 0 from: 600
to: 1 # -1 means all to: -1 # -1 means all
object_dir: /media/hofee/data/data/box_object_meshes object_dir: /media/hofee/data/data/object_meshes_part1
table_model_path: /media/hofee/data/data/others/table.obj table_model_path: "/media/hofee/data/data/others/table.obj"
output_dir: /media/hofee/data/data/box_views output_dir: /media/hofee/repository/data_part_1
binocular_vision: true binocular_vision: true
plane_size: 10 plane_size: 10
max_views: 512 max_views: 512
min_views: 128 min_views: 128
random_view_ratio: 0.2 random_view_ratio: 0.02
min_cam_table_included_degree: 20 min_cam_table_included_degree: 20
max_diag: 0.7 max_diag: 0.7
min_diag: 0.01 min_diag: 0.01

View File

@@ -10,13 +10,13 @@ runner:
root_dir: "experiments" root_dir: "experiments"
split: # split: #
root_dir: "/home/data/hofee/project/nbv_rec/data/nbv_rec_data_512_preproc_npy" root_dir: "/data/hofee/data/packed_preprocessed_data"
type: "unseen_instance" # "unseen_category" type: "unseen_instance" # "unseen_category"
datasets: datasets:
OmniObject3d_train: OmniObject3d_train:
path: "../data/sample_for_training_preprocessed/OmniObject3d_train.txt" path: "/data/hofee/data/OmniObject3d_train.txt"
ratio: 0.9 ratio: 0.9
OmniObject3d_test: OmniObject3d_test:
path: "../data/sample_for_training_preprocessed/OmniObject3d_test.txt" path: "/data/hofee/data/OmniObject3d_test.txt"
ratio: 0.1 ratio: 0.1

View File

@@ -1,32 +0,0 @@
runner:
general:
seed: 0
device: cpu
cuda_visible_devices: "0,1,2,3,4,5,6,7"
experiment:
name: debug
root_dir: "experiments"
generate:
voxel_threshold: 0.005
overlap_threshold: 0.5
filter_degree: 75
to_specified_dir: False # if True, output_dir is used, otherwise, root_dir is used
save_points: True
save_best_combined_points: True
save_mesh: True
overwrite: False
dataset_list:
- OmniObject3d
datasets:
OmniObject3d:
#"/media/hofee/data/data/temp_output"
root_dir: "../data/sample_for_training/scenes"
model_dir: "../data/scaled_object_meshes"
#output_dir: "/media/hofee/data/data/label_output"

View File

@@ -7,13 +7,13 @@ runner:
parallel: False parallel: False
experiment: experiment:
name: full_w_global_feat_wo_local_pts_feat name: overfit_ab_local_only
root_dir: "experiments" root_dir: "experiments"
use_checkpoint: False use_checkpoint: False
epoch: -1 # -1 stands for last epoch epoch: -1 # -1 stands for last epoch
max_epochs: 5000 max_epochs: 5000
save_checkpoint_interval: 1 save_checkpoint_interval: 1
test_first: True test_first: False
train: train:
optimizer: optimizer:
@@ -25,60 +25,60 @@ runner:
test: test:
frequency: 3 # test frequency frequency: 3 # test frequency
dataset_list: dataset_list:
- OmniObject3d_test #- OmniObject3d_test
- OmniObject3d_val - OmniObject3d_val
pipeline: nbv_reconstruction_global_pts_pipeline pipeline: nbv_reconstruction_pipeline
dataset: dataset:
OmniObject3d_train: OmniObject3d_train:
root_dir: "/home/data/hofee/project/nbv_rec/data/nbv_rec_data_512_preproc_npy" root_dir: "/data/hofee/nbv_rec_part2_preprocessed"
model_dir: "../data/scaled_object_meshes" model_dir: "../data/scaled_object_meshes"
source: nbv_reconstruction_dataset source: nbv_reconstruction_dataset
split_file: "/home/data/hofee/project/nbv_rec/data/OmniObject3d_train.txt" split_file: "/data/hofee/data/sample.txt"
type: train type: train
cache: True cache: True
ratio: 1 ratio: 1
batch_size: 160 batch_size: 32
num_workers: 16 num_workers: 16
pts_num: 4096 pts_num: 8192
load_from_preprocess: True load_from_preprocess: True
OmniObject3d_test: OmniObject3d_test:
root_dir: "/home/data/hofee/project/nbv_rec/data/nbv_rec_data_512_preproc_npy" root_dir: "/data/hofee/nbv_rec_part2_preprocessed"
model_dir: "../data/scaled_object_meshes" model_dir: "../data/scaled_object_meshes"
source: nbv_reconstruction_dataset source: nbv_reconstruction_dataset
split_file: "/home/data/hofee/project/nbv_rec/data/OmniObject3d_test.txt" split_file: "/data/hofee/data/sample.txt"
type: test type: test
cache: True cache: True
filter_degree: 75 filter_degree: 75
eval_list: eval_list:
- pose_diff - pose_diff
ratio: 0.05 ratio: 1
batch_size: 160 batch_size: 32
num_workers: 12 num_workers: 12
pts_num: 4096 pts_num: 8192
load_from_preprocess: True load_from_preprocess: True
OmniObject3d_val: OmniObject3d_val:
root_dir: "/home/data/hofee/project/nbv_rec/data/nbv_rec_data_512_preproc_npy" root_dir: "/data/hofee/nbv_rec_part2_preprocessed"
model_dir: "../data/scaled_object_meshes" model_dir: "../data/scaled_object_meshes"
source: nbv_reconstruction_dataset source: nbv_reconstruction_dataset
split_file: "/home/data/hofee/project/nbv_rec/data/OmniObject3d_train.txt" split_file: "/data/hofee/data/sample.txt"
type: test type: test
cache: True cache: True
filter_degree: 75 filter_degree: 75
eval_list: eval_list:
- pose_diff - pose_diff
ratio: 0.005 ratio: 1
batch_size: 160 batch_size: 32
num_workers: 12 num_workers: 12
pts_num: 4096 pts_num: 8192
load_from_preprocess: True load_from_preprocess: True
pipeline: pipeline:
nbv_reconstruction_local_pts_pipeline: nbv_reconstruction_pipeline:
modules: modules:
pts_encoder: pointnet_encoder pts_encoder: pointnet_encoder
seq_encoder: transformer_seq_encoder seq_encoder: transformer_seq_encoder
@@ -87,27 +87,17 @@ pipeline:
eps: 1e-5 eps: 1e-5
global_scanned_feat: True global_scanned_feat: True
nbv_reconstruction_global_pts_pipeline:
modules:
pts_encoder: pointnet_encoder
pose_seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
module: module:
pointnet_encoder: pointnet_encoder:
in_dim: 3 in_dim: 3
out_dim: 1024 out_dim: 512
global_feat: True global_feat: True
feature_transform: False feature_transform: False
transformer_seq_encoder: transformer_seq_encoder:
embed_dim: 1344 embed_dim: 768
num_heads: 4 num_heads: 4
ffn_dim: 256 ffn_dim: 256
num_layers: 3 num_layers: 3
@@ -128,6 +118,9 @@ module:
pose_dim: 9 pose_dim: 9
out_dim: 256 out_dim: 256
pts_num_encoder:
out_dim: 64
loss_function: loss_function:
gf_loss: gf_loss:

View File

@@ -1,53 +0,0 @@
runner:
general:
seed: 0
device: cpu
cuda_visible_devices: 0,1,2,3,4,5,6,7
experiment:
name: debug
root_dir: experiments
generate:
object_dir: ../data/scaled_object_meshes
table_model_path: ../data/others/table.obj
output_dir: ../data/nbv_reconstruction_data_512
binocular_vision: true
plane_size: 10
max_views: 512
min_views: 64
max_diag: 0.7
min_diag: 0.1
random_config:
display_table:
min_height: 0.05
max_height: 0.15
min_radius: 0.3
max_radius: 0.5
min_R: 0.05
max_R: 0.3
min_G: 0.05
max_G: 0.3
min_B: 0.05
max_B: 0.3
display_object:
min_x: 0
max_x: 0.03
min_y: 0
max_y: 0.03
min_z: 0.01
max_z: 0.01
random_rotation_ratio: 0.3
random_objects:
num: 4
cluster: 0.9
light_and_camera_config:
Camera:
near_plane: 0.01
far_plane: 5
fov_vertical: 25
resolution: [1280,800]
eye_distance: 0.15
eye_angle: 25
Light:
location: [0,0,3.5]
orientation: [0,0,0]
power: 150

View File

@@ -8,7 +8,7 @@ import torch
import os import os
import sys import sys
sys.path.append(r"/home/data/hofee/project/nbv_rec/nbv_reconstruction") sys.path.append(r"/data/hofee/project/nbv_rec/nbv_reconstruction")
from utils.data_load import DataLoadUtil from utils.data_load import DataLoadUtil
from utils.pose import PoseUtil from utils.pose import PoseUtil
@@ -31,10 +31,10 @@ class NBVReconstructionDataset(BaseDataset):
self.load_from_preprocess = config.get("load_from_preprocess", False) self.load_from_preprocess = config.get("load_from_preprocess", False)
if self.type == namespace.Mode.TEST: if self.type == namespace.Mode.TEST:
self.model_dir = config["model_dir"] #self.model_dir = config["model_dir"]
self.filter_degree = config["filter_degree"] self.filter_degree = config["filter_degree"]
if self.type == namespace.Mode.TRAIN: if self.type == namespace.Mode.TRAIN:
scale_ratio = 1 scale_ratio = 50
self.datalist = self.datalist*scale_ratio self.datalist = self.datalist*scale_ratio
if self.cache: if self.cache:
expr_root = ConfigManager.get("runner", "experiment", "root_dir") expr_root = ConfigManager.get("runner", "experiment", "root_dir")
@@ -66,7 +66,9 @@ class NBVReconstructionDataset(BaseDataset):
if max_coverage_rate > scene_max_coverage_rate: if max_coverage_rate > scene_max_coverage_rate:
scene_max_coverage_rate = max_coverage_rate scene_max_coverage_rate = max_coverage_rate
max_coverage_rate_list.append(max_coverage_rate) max_coverage_rate_list.append(max_coverage_rate)
mean_coverage_rate = np.mean(max_coverage_rate_list)
if max_coverage_rate_list:
mean_coverage_rate = np.mean(max_coverage_rate_list)
for seq_idx in range(seq_num): for seq_idx in range(seq_num):
label_path = DataLoadUtil.get_label_path( label_path = DataLoadUtil.get_label_path(
@@ -112,6 +114,10 @@ class NBVReconstructionDataset(BaseDataset):
except Exception as e: except Exception as e:
Log.error(f"Save cache failed: {e}") Log.error(f"Save cache failed: {e}")
def voxel_downsample_with_mask(self, pts, voxel_size):
pass
def __getitem__(self, index): def __getitem__(self, index):
data_item_info = self.datalist[index] data_item_info = self.datalist[index]
scanned_views = data_item_info["scanned_views"] scanned_views = data_item_info["scanned_views"]
@@ -122,7 +128,7 @@ class NBVReconstructionDataset(BaseDataset):
scanned_views_pts, scanned_views_pts,
scanned_coverages_rate, scanned_coverages_rate,
scanned_n_to_world_pose, scanned_n_to_world_pose,
) = ([], [], [], []) ) = ([], [], [])
for view in scanned_views: for view in scanned_views:
frame_idx = view[0] frame_idx = view[0]
coverage_rate = view[1] coverage_rate = view[1]
@@ -159,28 +165,8 @@ class NBVReconstructionDataset(BaseDataset):
[best_to_world_6d, best_to_world_trans], axis=0 [best_to_world_6d, best_to_world_trans], axis=0
) )
combined_scanned_views_pts = np.concatenate(scanned_views_pts, axis=0)
fps_downsampled_combined_scanned_pts, fps_idx = PtsUtil.fps_downsample_point_cloud(
combined_scanned_views_pts, self.pts_num, require_idx=True
)
combined_scanned_views_pts_mask = np.zeros(len(scanned_views_pts), dtype=np.uint8)
start_idx = 0
for i in range(len(scanned_views_pts)):
end_idx = start_idx + len(scanned_views_pts[i])
combined_scanned_views_pts_mask[start_idx:end_idx] = i
start_idx = end_idx
fps_downsampled_combined_scanned_pts_mask = combined_scanned_views_pts_mask[fps_idx]
data_item = { data_item = {
"scanned_pts": np.asarray(scanned_views_pts, dtype=np.float32), # Ndarray(S x Nv x 3) "scanned_pts": np.asarray(scanned_views_pts, dtype=np.float32), # Ndarray(S x Nv x 3)
"scanned_pts_mask": np.asarray(fps_downsampled_combined_scanned_pts_mask,dtype=np.uint8), # Ndarray(N), range(0, S)
"combined_scanned_pts": np.asarray(fps_downsampled_combined_scanned_pts, dtype=np.float32), # Ndarray(N x 3)
"scanned_coverage_rate": scanned_coverages_rate, # List(S): Float, range(0, 1) "scanned_coverage_rate": scanned_coverages_rate, # List(S): Float, range(0, 1)
"scanned_n_to_world_pose_9d": np.asarray(scanned_n_to_world_pose, dtype=np.float32), # Ndarray(S x 9) "scanned_n_to_world_pose_9d": np.asarray(scanned_n_to_world_pose, dtype=np.float32), # Ndarray(S x 9)
"best_coverage_rate": nbv_coverage_rate, # Float, range(0, 1) "best_coverage_rate": nbv_coverage_rate, # Float, range(0, 1)
@@ -212,12 +198,6 @@ class NBVReconstructionDataset(BaseDataset):
collate_data["best_to_world_pose_9d"] = torch.stack( collate_data["best_to_world_pose_9d"] = torch.stack(
[torch.tensor(item["best_to_world_pose_9d"]) for item in batch] [torch.tensor(item["best_to_world_pose_9d"]) for item in batch]
) )
collate_data["combined_scanned_pts"] = torch.stack(
[torch.tensor(item["combined_scanned_pts"]) for item in batch]
)
collate_data["scanned_pts_mask"] = torch.stack(
[torch.tensor(item["scanned_pts_mask"]) for item in batch]
)
for key in batch[0].keys(): for key in batch[0].keys():
if key not in [ if key not in [
@@ -225,7 +205,6 @@ class NBVReconstructionDataset(BaseDataset):
"scanned_pts_mask", "scanned_pts_mask",
"scanned_n_to_world_pose_9d", "scanned_n_to_world_pose_9d",
"best_to_world_pose_9d", "best_to_world_pose_9d",
"combined_scanned_pts",
]: ]:
collate_data[key] = [item[key] for item in batch] collate_data[key] = [item[key] for item in batch]
return collate_data return collate_data
@@ -241,10 +220,9 @@ if __name__ == "__main__":
torch.manual_seed(seed) torch.manual_seed(seed)
np.random.seed(seed) np.random.seed(seed)
config = { config = {
"root_dir": "/home/data/hofee/project/nbv_rec/data/nbv_rec_data_512_preproc_npy", "root_dir": "/data/hofee/data/packed_preprocessed_data",
"model_dir": "/home/data/hofee/project/nbv_rec/data/scaled_object_meshes",
"source": "nbv_reconstruction_dataset", "source": "nbv_reconstruction_dataset",
"split_file": "/home/data/hofee/project/nbv_rec/data/OmniObject3d_test.txt", "split_file": "/data/hofee/data/OmniObject3d_train.txt",
"load_from_preprocess": True, "load_from_preprocess": True,
"ratio": 0.5, "ratio": 0.5,
"batch_size": 2, "batch_size": 2,

View File

@@ -1,4 +1,5 @@
import torch import torch
import time
from torch import nn from torch import nn
import PytorchBoot.namespace as namespace import PytorchBoot.namespace as namespace
import PytorchBoot.stereotype as stereotype import PytorchBoot.stereotype as stereotype
@@ -6,10 +7,10 @@ from PytorchBoot.factory.component_factory import ComponentFactory
from PytorchBoot.utils import Log from PytorchBoot.utils import Log
@stereotype.pipeline("nbv_reconstruction_global_pts_n_num_pipeline") @stereotype.pipeline("nbv_reconstruction_pipeline")
class NBVReconstructionGlobalPointsPipeline(nn.Module): class NBVReconstructionPipeline(nn.Module):
def __init__(self, config): def __init__(self, config):
super(NBVReconstructionGlobalPointsPipeline, self).__init__() super(NBVReconstructionPipeline, self).__init__()
self.config = config self.config = config
self.module_config = config["modules"] self.module_config = config["modules"]
@@ -19,12 +20,8 @@ class NBVReconstructionGlobalPointsPipeline(nn.Module):
self.pose_encoder = ComponentFactory.create( self.pose_encoder = ComponentFactory.create(
namespace.Stereotype.MODULE, self.module_config["pose_encoder"] namespace.Stereotype.MODULE, self.module_config["pose_encoder"]
) )
self.pts_num_encoder = ComponentFactory.create( self.seq_encoder = ComponentFactory.create(
namespace.Stereotype.MODULE, self.module_config["pts_num_encoder"] namespace.Stereotype.MODULE, self.module_config["seq_encoder"]
)
self.transformer_seq_encoder = ComponentFactory.create(
namespace.Stereotype.MODULE, self.module_config["transformer_seq_encoder"]
) )
self.view_finder = ComponentFactory.create( self.view_finder = ComponentFactory.create(
namespace.Stereotype.MODULE, self.module_config["view_finder"] namespace.Stereotype.MODULE, self.module_config["view_finder"]
@@ -58,7 +55,10 @@ class NBVReconstructionGlobalPointsPipeline(nn.Module):
return perturbed_x, random_t, target_score, std return perturbed_x, random_t, target_score, std
def forward_train(self, data): def forward_train(self, data):
start_time = time.time()
main_feat = self.get_main_feat(data) main_feat = self.get_main_feat(data)
end_time = time.time()
print("get_main_feat time: ", end_time - start_time)
""" get std """ """ get std """
best_to_world_pose_9d_batch = data["best_to_world_pose_9d"] best_to_world_pose_9d_batch = data["best_to_world_pose_9d"]
perturbed_x, random_t, target_score, std = self.pertube_data( perturbed_x, random_t, target_score, std = self.pertube_data(
@@ -92,48 +92,23 @@ class NBVReconstructionGlobalPointsPipeline(nn.Module):
scanned_n_to_world_pose_9d_batch = data[ scanned_n_to_world_pose_9d_batch = data[
"scanned_n_to_world_pose_9d" "scanned_n_to_world_pose_9d"
] # List(B): Tensor(S x 9) ] # List(B): Tensor(S x 9)
scanned_pts_mask_batch = data[ scanned_pts_batch = data[
"scanned_pts_mask" "scanned_pts"
] # Tensor(B x N) ]
device = next(self.parameters()).device device = next(self.parameters()).device
embedding_list_batch = [] embedding_list_batch = []
combined_scanned_pts_batch = data["combined_scanned_pts"] # Tensor(B x N x 3) for scanned_n_to_world_pose_9d, scanned_pts in zip(scanned_n_to_world_pose_9d_batch, scanned_pts_batch):
global_scanned_feat, perpoint_scanned_feat_batch = self.pts_encoder.encode_points(
combined_scanned_pts_batch, require_per_point_feat=True
) # global_scanned_feat: Tensor(B x Dg), perpoint_scanned_feat: Tensor(B x N x Dl)
for scanned_n_to_world_pose_9d, scanned_mask, perpoint_scanned_feat in zip(
scanned_n_to_world_pose_9d_batch,
scanned_pts_mask_batch,
perpoint_scanned_feat_batch,
):
scanned_target_pts_num = [] # List(S): Int
partial_feat_seq = []
seq_len = len(scanned_n_to_world_pose_9d)
for seq_idx in range(seq_len):
partial_idx_in_combined_pts = scanned_mask == seq_idx # Ndarray(V), N->V idx mask
partial_perpoint_feat = perpoint_scanned_feat[partial_idx_in_combined_pts] # Ndarray(V x Dl)
partial_feat = torch.mean(partial_perpoint_feat, dim=0)[0] # Tensor(Dl)
partial_feat_seq.append(partial_feat)
scanned_target_pts_num.append(partial_perpoint_feat.shape[0])
scanned_target_pts_num = torch.tensor(scanned_target_pts_num, dtype=torch.int32).to(device) # Tensor(S)
scanned_n_to_world_pose_9d = scanned_n_to_world_pose_9d.to(device) # Tensor(S x 9) scanned_n_to_world_pose_9d = scanned_n_to_world_pose_9d.to(device) # Tensor(S x 9)
scanned_pts = scanned_pts.to(device) # Tensor(S x N x 3)
pose_feat_seq = self.pose_encoder.encode_pose(scanned_n_to_world_pose_9d) # Tensor(S x Dp) pose_feat_seq = self.pose_encoder.encode_pose(scanned_n_to_world_pose_9d) # Tensor(S x Dp)
pts_num_feat_seq = self.pts_num_encoder.encode_pts_num(scanned_target_pts_num) # Tensor(S x Dn) pts_feat_seq = self.pts_encoder.encode_points(scanned_pts, require_per_point_feat=False) # Tensor(S x Dl)
partial_feat_seq = torch.stack(partial_feat_seq) # Tensor(S x Dl) seq_embedding = torch.cat([pose_feat_seq, pts_feat_seq], dim=-1) # Tensor(S x (Dp+Dl))
embedding_list_batch.append(seq_embedding) # List(B): Tensor(S x (Dp+Dl))
seq_embedding = torch.cat([pose_feat_seq, pts_num_feat_seq, partial_feat_seq], dim=-1) # Tensor(S x (Dp+Dn+Dl))
embedding_list_batch.append(seq_embedding) # List(B): Tensor(S x (Dp+Dn+Dl)) seq_feat = self.seq_encoder.encode_sequence(embedding_list_batch) # Tensor(B x Ds)
main_feat = seq_feat # Tensor(B x Ds)
seq_feat = self.transformer_seq_encoder.encode_sequence(embedding_list_batch) # Tensor(B x Ds)
main_feat = torch.cat([seq_feat, global_scanned_feat], dim=-1) # Tensor(B x (Ds+Dg))
if torch.isnan(main_feat).any(): if torch.isnan(main_feat).any():
Log.error("nan in main_feat", True) Log.error("nan in main_feat", True)

View File

@@ -0,0 +1,43 @@
import os
import shutil
def clean_scene_data(root, scene):
# Clean up target point cloud data
pts_dir = os.path.join(root, scene, "pts")
if os.path.exists(pts_dir):
shutil.rmtree(pts_dir)
print(f"已删除 {pts_dir}")
# 清理法线数据
nrm_dir = os.path.join(root, scene, "nrm")
if os.path.exists(nrm_dir):
shutil.rmtree(nrm_dir)
print(f"已删除 {nrm_dir}")
# 清理扫描点索引数据
scan_points_indices_dir = os.path.join(root, scene, "scan_points_indices")
if os.path.exists(scan_points_indices_dir):
shutil.rmtree(scan_points_indices_dir)
print(f"已删除 {scan_points_indices_dir}")
# 删除扫描点数据文件
scan_points_file = os.path.join(root, scene, "scan_points.txt")
if os.path.exists(scan_points_file):
os.remove(scan_points_file)
print(f"已删除 {scan_points_file}")
def clean_all_scenes(root, scene_list):
for idx, scene in enumerate(scene_list):
print(f"正在清理场景 {scene} ({idx+1}/{len(scene_list)})")
clean_scene_data(root, scene)
if __name__ == "__main__":
root = r"c:\Document\Local Project\nbv_rec\nbv_reconstruction\temp"
scene_list = os.listdir(root)
from_idx = 0
to_idx = len(scene_list)
print(f"正在清理场景 {scene_list[from_idx:to_idx]}")
clean_all_scenes(root, scene_list[from_idx:to_idx])
print("清理完成")

View File

@@ -0,0 +1,48 @@
import os
import shutil
def pack_scene_data(root, scene, output_dir):
scene_dir = os.path.join(output_dir, scene)
if not os.path.exists(scene_dir):
os.makedirs(scene_dir)
pts_dir = os.path.join(root, scene, "pts")
if os.path.exists(pts_dir):
shutil.move(pts_dir, os.path.join(scene_dir, "pts"))
scan_points_indices_dir = os.path.join(root, scene, "scan_points_indices")
if os.path.exists(scan_points_indices_dir):
shutil.move(scan_points_indices_dir, os.path.join(scene_dir, "scan_points_indices"))
scan_points_file = os.path.join(root, scene, "scan_points.txt")
if os.path.exists(scan_points_file):
shutil.move(scan_points_file, os.path.join(scene_dir, "scan_points.txt"))
model_pts_nrm_file = os.path.join(root, scene, "points_and_normals.txt")
if os.path.exists(model_pts_nrm_file):
shutil.move(model_pts_nrm_file, os.path.join(scene_dir, "points_and_normals.txt"))
camera_dir = os.path.join(root, scene, "camera_params")
if os.path.exists(camera_dir):
shutil.move(camera_dir, os.path.join(scene_dir, "camera_params"))
scene_info_file = os.path.join(root, scene, "scene_info.json")
if os.path.exists(scene_info_file):
shutil.move(scene_info_file, os.path.join(scene_dir, "scene_info.json"))
def pack_all_scenes(root, scene_list, output_dir):
for idx, scene in enumerate(scene_list):
print(f"正在打包场景 {scene} ({idx+1}/{len(scene_list)})")
pack_scene_data(root, scene, output_dir)
if __name__ == "__main__":
root = r"H:\AI\Datasets\nbv_rec_part2"
output_dir = r"H:\AI\Datasets\scene_info_part2"
scene_list = os.listdir(root)
from_idx = 0
to_idx = len(scene_list)
print(f"正在打包场景 {scene_list[from_idx:to_idx]}")
pack_all_scenes(root, scene_list[from_idx:to_idx], output_dir)
print("打包完成")

View File

@@ -0,0 +1,41 @@
import os
import shutil
def pack_scene_data(root, scene, output_dir):
scene_dir = os.path.join(output_dir, scene)
if not os.path.exists(scene_dir):
os.makedirs(scene_dir)
pts_dir = os.path.join(root, scene, "pts")
if os.path.exists(pts_dir):
shutil.move(pts_dir, os.path.join(scene_dir, "pts"))
camera_dir = os.path.join(root, scene, "camera_params")
if os.path.exists(camera_dir):
shutil.move(camera_dir, os.path.join(scene_dir, "camera_params"))
scene_info_file = os.path.join(root, scene, "scene_info.json")
if os.path.exists(scene_info_file):
shutil.move(scene_info_file, os.path.join(scene_dir, "scene_info.json"))
label_dir = os.path.join(root, scene, "label")
if os.path.exists(label_dir):
shutil.move(label_dir, os.path.join(scene_dir, "label"))
def pack_all_scenes(root, scene_list, output_dir):
for idx, scene in enumerate(scene_list):
print(f"packing {scene} ({idx+1}/{len(scene_list)})")
pack_scene_data(root, scene, output_dir)
if __name__ == "__main__":
root = r"H:\AI\Datasets\nbv_rec_part2"
output_dir = r"H:\AI\Datasets\upload_part2"
scene_list = os.listdir(root)
from_idx = 0
to_idx = len(scene_list)
print(f"packing {scene_list[from_idx:to_idx]}")
pack_all_scenes(root, scene_list[from_idx:to_idx], output_dir)
print("packing done")

View File

@@ -9,8 +9,6 @@ from utils.reconstruction import ReconstructionUtil
from utils.data_load import DataLoadUtil from utils.data_load import DataLoadUtil
from utils.pts import PtsUtil from utils.pts import PtsUtil
# scan shoe 536
def save_np_pts(path, pts: np.ndarray, file_type="txt"): def save_np_pts(path, pts: np.ndarray, file_type="txt"):
if file_type == "txt": if file_type == "txt":
np.savetxt(path, pts) np.savetxt(path, pts)
@@ -23,6 +21,12 @@ def save_target_points(root, scene, frame_idx, target_points: np.ndarray, file_t
if not os.path.exists(os.path.join(root,scene, "pts")): if not os.path.exists(os.path.join(root,scene, "pts")):
os.makedirs(os.path.join(root,scene, "pts")) os.makedirs(os.path.join(root,scene, "pts"))
save_np_pts(pts_path, target_points, file_type) save_np_pts(pts_path, target_points, file_type)
def save_target_normals(root, scene, frame_idx, target_normals: np.ndarray, file_type="txt"):
pts_path = os.path.join(root,scene, "nrm", f"{frame_idx}.{file_type}")
if not os.path.exists(os.path.join(root,scene, "nrm")):
os.makedirs(os.path.join(root,scene, "nrm"))
save_np_pts(pts_path, target_normals, file_type)
def save_scan_points_indices(root, scene, frame_idx, scan_points_indices: np.ndarray, file_type="txt"): def save_scan_points_indices(root, scene, frame_idx, scan_points_indices: np.ndarray, file_type="txt"):
indices_path = os.path.join(root,scene, "scan_points_indices", f"{frame_idx}.{file_type}") indices_path = os.path.join(root,scene, "scan_points_indices", f"{frame_idx}.{file_type}")
@@ -49,6 +53,26 @@ def get_world_points(depth, mask, cam_intrinsic, cam_extrinsic, random_downsampl
return points_camera_world return points_camera_world
def get_world_points_and_normal(depth, mask, normal, cam_intrinsic, cam_extrinsic, random_downsample_N):
z = depth[mask]
i, j = np.nonzero(mask)
x = (j - cam_intrinsic[0, 2]) * z / cam_intrinsic[0, 0]
y = (i - cam_intrinsic[1, 2]) * z / cam_intrinsic[1, 1]
points_camera = np.stack((x, y, z), axis=-1).reshape(-1, 3)
normal_camera = normal[mask].reshape(-1, 3)
sampled_target_points, idx = PtsUtil.random_downsample_point_cloud(
points_camera, random_downsample_N, require_idx=True
)
if len(sampled_target_points) == 0:
return np.zeros((0, 3)), np.zeros((0, 3))
sampled_normal_camera = normal_camera[idx]
points_camera_aug = np.concatenate((sampled_target_points, np.ones((sampled_target_points.shape[0], 1))), axis=-1)
points_camera_world = np.dot(cam_extrinsic, points_camera_aug.T).T[:, :3]
return points_camera_world, sampled_normal_camera
def get_scan_points_indices(scan_points, mask, display_table_mask_label, cam_intrinsic, cam_extrinsic): def get_scan_points_indices(scan_points, mask, display_table_mask_label, cam_intrinsic, cam_extrinsic):
scan_points_homogeneous = np.hstack((scan_points, np.ones((scan_points.shape[0], 1)))) scan_points_homogeneous = np.hstack((scan_points, np.ones((scan_points.shape[0], 1))))
points_camera = np.dot(np.linalg.inv(cam_extrinsic), scan_points_homogeneous.T).T[:, :3] points_camera = np.dot(np.linalg.inv(cam_extrinsic), scan_points_homogeneous.T).T[:, :3]
@@ -67,8 +91,8 @@ def get_scan_points_indices(scan_points, mask, display_table_mask_label, cam_int
def save_scene_data(root, scene, scene_idx=0, scene_total=1,file_type="txt"): def save_scene_data(root, scene, scene_idx=0, scene_total=1,file_type="txt"):
''' configuration ''' ''' configuration '''
target_mask_label = (0, 255, 0, 255) target_mask_label = (0, 255, 0)
display_table_mask_label=(0, 0, 255, 255) display_table_mask_label=(0, 0, 255)
random_downsample_N = 32768 random_downsample_N = 32768
voxel_size=0.003 voxel_size=0.003
filter_degree = 75 filter_degree = 75
@@ -93,7 +117,7 @@ def save_scene_data(root, scene, scene_idx=0, scene_total=1,file_type="txt"):
binocular=True binocular=True
) )
mask_L, mask_R = DataLoadUtil.load_seg(path, binocular=True) mask_L, mask_R = DataLoadUtil.load_seg(path, binocular=True)
normal_L = DataLoadUtil.load_normal(path, binocular=True, left_only=True)
''' target points ''' ''' target points '''
mask_img_L = mask_L mask_img_L = mask_L
mask_img_R = mask_R mask_img_R = mask_R
@@ -102,23 +126,23 @@ def save_scene_data(root, scene, scene_idx=0, scene_total=1,file_type="txt"):
target_mask_img_R = (mask_R == target_mask_label).all(axis=-1) target_mask_img_R = (mask_R == target_mask_label).all(axis=-1)
sampled_target_points_L = get_world_points(depth_L, target_mask_img_L, cam_info["cam_intrinsic"], cam_info["cam_to_world"], random_downsample_N) sampled_target_points_L, sampled_target_normal_L = get_world_points_and_normal(depth_L,target_mask_img_L,normal_L, cam_info["cam_intrinsic"], cam_info["cam_to_world"], random_downsample_N)
sampled_target_points_R = get_world_points(depth_R, target_mask_img_R, cam_info["cam_intrinsic"], cam_info["cam_to_world_R"], random_downsample_N) sampled_target_points_R = get_world_points(depth_R, target_mask_img_R, cam_info["cam_intrinsic"], cam_info["cam_to_world_R"], random_downsample_N)
has_points = sampled_target_points_L.shape[0] > 0 and sampled_target_points_R.shape[0] > 0 has_points = sampled_target_points_L.shape[0] > 0 and sampled_target_points_R.shape[0] > 0
if has_points: if has_points:
target_points = PtsUtil.get_overlapping_points( target_points, overlap_idx = PtsUtil.get_overlapping_points(
sampled_target_points_L, sampled_target_points_R, voxel_size sampled_target_points_L, sampled_target_points_R, voxel_size, require_idx=True
) )
sampled_target_normal_L = sampled_target_normal_L[overlap_idx]
if has_points: if has_points:
has_points = target_points.shape[0] > 0 has_points = target_points.shape[0] > 0
if has_points: if has_points:
points_normals = DataLoadUtil.load_points_normals(root, scene, display_table_as_world_space_origin=True) target_points, target_normals = PtsUtil.filter_points(
target_points = PtsUtil.filter_points( target_points, sampled_target_normal_L, cam_info["cam_to_world"], theta_limit = filter_degree, z_range=(min_z, max_z)
target_points, points_normals, cam_info["cam_to_world"],voxel_size=0.002, theta = filter_degree, z_range=(min_z, max_z)
) )
@@ -129,8 +153,10 @@ def save_scene_data(root, scene, scene_idx=0, scene_total=1,file_type="txt"):
if not has_points: if not has_points:
target_points = np.zeros((0, 3)) target_points = np.zeros((0, 3))
target_normals = np.zeros((0, 3))
save_target_points(root, scene, frame_id, target_points, file_type=file_type) save_target_points(root, scene, frame_id, target_points, file_type=file_type)
save_target_normals(root, scene, frame_id, target_normals, file_type=file_type)
save_scan_points_indices(root, scene, frame_id, scan_points_indices, file_type=file_type) save_scan_points_indices(root, scene, frame_id, scan_points_indices, file_type=file_type)
save_scan_points(root, scene, scan_points) # The "done" flag of scene preprocess save_scan_points(root, scene, scan_points) # The "done" flag of scene preprocess
@@ -138,17 +164,10 @@ def save_scene_data(root, scene, scene_idx=0, scene_total=1,file_type="txt"):
if __name__ == "__main__": if __name__ == "__main__":
#root = "/media/hofee/repository/new_data_with_normal" #root = "/media/hofee/repository/new_data_with_normal"
root = r"/media/hofee/repository/full_data_output" root = r"H:\AI\Datasets\nbv_rec_part2"
# list_path = r"/media/hofee/repository/full_list.txt"
# scene_list = []
# with open(list_path, "r") as f:
# for line in f:
# scene_list.append(line.strip())
scene_list = os.listdir(root) scene_list = os.listdir(root)
from_idx = 0 # 1000 from_idx = 0 # 1000
to_idx = 700 # 1500 to_idx = 600 # 1500
print(scene_list)
cnt = 0 cnt = 0
@@ -156,6 +175,10 @@ if __name__ == "__main__":
total = to_idx - from_idx total = to_idx - from_idx
for scene in scene_list[from_idx:to_idx]: for scene in scene_list[from_idx:to_idx]:
start = time.time() start = time.time()
if os.path.exists(os.path.join(root, scene, "scan_points.txt")):
print(f"Scene {scene} has been processed")
cnt+=1
continue
save_scene_data(root, scene, cnt, total, file_type="npy") save_scene_data(root, scene, cnt, total, file_type="npy")
cnt+=1 cnt+=1
end = time.time() end = time.time()

109
runners/inferece_server.py Normal file
View File

@@ -0,0 +1,109 @@
import os
import json
import torch
import numpy as np
from flask import Flask, request, jsonify
import PytorchBoot.namespace as namespace
import PytorchBoot.stereotype as stereotype
from PytorchBoot.factory import ComponentFactory
from PytorchBoot.runners.runner import Runner
from PytorchBoot.utils import Log
from utils.pts import PtsUtil
@stereotype.runner("inferencer")
class InferencerServer(Runner):
def __init__(self, config_path):
super().__init__(config_path)
''' Web Server '''
self.app = Flask(__name__)
''' Pipeline '''
self.pipeline_name = self.config[namespace.Stereotype.PIPELINE]
self.pipeline:torch.nn.Module = ComponentFactory.create(namespace.Stereotype.PIPELINE, self.pipeline_name)
self.pipeline = self.pipeline.to(self.device)
''' Experiment '''
self.load_experiment("nbv_evaluator")
def get_input_data(self, data):
input_data = {}
scanned_pts = data["scanned_pts"]
scanned_n_to_world_pose_9d = data["scanned_n_to_world_pose_9d"]
combined_scanned_views_pts = np.concatenate(scanned_pts, axis=0)
fps_downsampled_combined_scanned_pts, fps_idx = PtsUtil.fps_downsample_point_cloud(
combined_scanned_views_pts, self.pts_num, require_idx=True
)
combined_scanned_views_pts_mask = np.zeros(len(scanned_pts), dtype=np.uint8)
start_idx = 0
for i in range(len(scanned_pts)):
end_idx = start_idx + len(scanned_pts[i])
combined_scanned_views_pts_mask[start_idx:end_idx] = i
start_idx = end_idx
fps_downsampled_combined_scanned_pts_mask = combined_scanned_views_pts_mask[fps_idx]
input_data["scanned_pts_mask"] = np.asarray(fps_downsampled_combined_scanned_pts_mask, dtype=np.uint8)
input_data["scanned_n_to_world_pose_9d"] = np.asarray(scanned_n_to_world_pose_9d, dtype=np.float32)
input_data["combined_scanned_pts"] = np.asarray(fps_downsampled_combined_scanned_pts, dtype=np.float32)
return input_data
def get_result(self, output_data):
estimated_delta_rot_9d = output_data["pred_pose_9d"]
result = {
"estimated_delta_rot_9d": estimated_delta_rot_9d.tolist()
}
return result
def run(self):
Log.info("Loading from epoch {}.".format(self.current_epoch))
@self.app.route("/inference", methods=["POST"])
def inference():
data = request.json
input_data = self.get_input_data(data)
output_data = self.pipeline.forward_test(input_data)
result = self.get_result(output_data)
return jsonify(result)
self.app.run(host="0.0.0.0", port=5000)
def get_checkpoint_path(self, is_last=False):
return os.path.join(self.experiment_path, namespace.Direcotry.CHECKPOINT_DIR_NAME,
"Epoch_{}.pth".format(
self.current_epoch if self.current_epoch != -1 and not is_last else "last"))
def load_checkpoint(self, is_last=False):
self.load(self.get_checkpoint_path(is_last))
Log.success(f"Loaded checkpoint from {self.get_checkpoint_path(is_last)}")
if is_last:
checkpoint_root = os.path.join(self.experiment_path, namespace.Direcotry.CHECKPOINT_DIR_NAME)
meta_path = os.path.join(checkpoint_root, "meta.json")
if not os.path.exists(meta_path):
raise FileNotFoundError(
"No checkpoint meta.json file in the experiment {}".format(self.experiments_config["name"]))
file_path = os.path.join(checkpoint_root, "meta.json")
with open(file_path, "r") as f:
meta = json.load(f)
self.current_epoch = meta["last_epoch"]
self.current_iter = meta["last_iter"]
def load_experiment(self, backup_name=None):
super().load_experiment(backup_name)
self.current_epoch = self.experiments_config["epoch"]
self.load_checkpoint(is_last=(self.current_epoch == -1))
def create_experiment(self, backup_name=None):
super().create_experiment(backup_name)
def load(self, path):
state_dict = torch.load(path)
self.pipeline.load_state_dict(state_dict)

View File

@@ -22,25 +22,21 @@ class StrategyGenerator(Runner):
"app_name": "generate_strategy", "app_name": "generate_strategy",
"runner_name": "strategy_generator" "runner_name": "strategy_generator"
} }
self.to_specified_dir = ConfigManager.get("runner", "generate", "to_specified_dir")
self.save_best_combined_pts = ConfigManager.get("runner", "generate", "save_best_combined_points")
self.save_mesh = ConfigManager.get("runner", "generate", "save_mesh")
self.load_pts = ConfigManager.get("runner", "generate", "load_points")
self.filter_degree = ConfigManager.get("runner", "generate", "filter_degree")
self.overwrite = ConfigManager.get("runner", "generate", "overwrite") self.overwrite = ConfigManager.get("runner", "generate", "overwrite")
self.save_pts = ConfigManager.get("runner","generate","save_points")
self.seq_num = ConfigManager.get("runner","generate","seq_num") self.seq_num = ConfigManager.get("runner","generate","seq_num")
self.overlap_area_threshold = ConfigManager.get("runner","generate","overlap_area_threshold")
self.compute_with_normal = ConfigManager.get("runner","generate","compute_with_normal")
self.scan_points_threshold = ConfigManager.get("runner","generate","scan_points_threshold")
def run(self): def run(self):
dataset_name_list = ConfigManager.get("runner", "generate", "dataset_list") dataset_name_list = ConfigManager.get("runner", "generate", "dataset_list")
voxel_threshold, soft_overlap_threshold, hard_overlap_threshold = ConfigManager.get("runner","generate","voxel_threshold"), ConfigManager.get("runner","generate","soft_overlap_threshold"), ConfigManager.get("runner","generate","hard_overlap_threshold") voxel_threshold = ConfigManager.get("runner","generate","voxel_threshold")
for dataset_idx in range(len(dataset_name_list)): for dataset_idx in range(len(dataset_name_list)):
dataset_name = dataset_name_list[dataset_idx] dataset_name = dataset_name_list[dataset_idx]
status_manager.set_progress("generate_strategy", "strategy_generator", "dataset", dataset_idx, len(dataset_name_list)) status_manager.set_progress("generate_strategy", "strategy_generator", "dataset", dataset_idx, len(dataset_name_list))
root_dir = ConfigManager.get("datasets", dataset_name, "root_dir") root_dir = ConfigManager.get("datasets", dataset_name, "root_dir")
model_dir = ConfigManager.get("datasets", dataset_name, "model_dir")
from_idx = ConfigManager.get("datasets",dataset_name,"from") from_idx = ConfigManager.get("datasets",dataset_name,"from")
to_idx = ConfigManager.get("datasets",dataset_name,"to") to_idx = ConfigManager.get("datasets",dataset_name,"to")
scene_name_list = os.listdir(root_dir) scene_name_list = os.listdir(root_dir)
@@ -52,17 +48,13 @@ class StrategyGenerator(Runner):
for scene_name in scene_name_list[from_idx:to_idx]: for scene_name in scene_name_list[from_idx:to_idx]:
Log.info(f"({dataset_name})Processing [{cnt}/{total}]: {scene_name}") Log.info(f"({dataset_name})Processing [{cnt}/{total}]: {scene_name}")
status_manager.set_progress("generate_strategy", "strategy_generator", "scene", cnt, total) status_manager.set_progress("generate_strategy", "strategy_generator", "scene", cnt, total)
diag = DataLoadUtil.get_bbox_diag(model_dir, scene_name)
status_manager.set_status("generate_strategy", "strategy_generator", "diagonal", diag)
output_label_path = DataLoadUtil.get_label_path(root_dir, scene_name,0) output_label_path = DataLoadUtil.get_label_path(root_dir, scene_name,0)
if os.path.exists(output_label_path) and not self.overwrite: if os.path.exists(output_label_path) and not self.overwrite:
Log.info(f"Scene <{scene_name}> Already Exists, Skip") Log.info(f"Scene <{scene_name}> Already Exists, Skip")
cnt += 1 cnt += 1
continue continue
self.generate_sequence(root_dir, model_dir, scene_name,voxel_threshold, soft_overlap_threshold, hard_overlap_threshold) self.generate_sequence(root_dir, scene_name,voxel_threshold)
# except Exception as e:
# Log.error(f"Scene <{scene_name}> Failed, Error: {e}")
cnt += 1 cnt += 1
status_manager.set_progress("generate_strategy", "strategy_generator", "scene", total, total) status_manager.set_progress("generate_strategy", "strategy_generator", "scene", total, total)
status_manager.set_progress("generate_strategy", "strategy_generator", "dataset", len(dataset_name_list), len(dataset_name_list)) status_manager.set_progress("generate_strategy", "strategy_generator", "dataset", len(dataset_name_list), len(dataset_name_list))
@@ -75,27 +67,36 @@ class StrategyGenerator(Runner):
def load_experiment(self, backup_name=None): def load_experiment(self, backup_name=None):
super().load_experiment(backup_name) super().load_experiment(backup_name)
def generate_sequence(self, root, model_dir, scene_name, voxel_threshold, soft_overlap_threshold, hard_overlap_threshold): def generate_sequence(self, root, scene_name, voxel_threshold):
status_manager.set_status("generate_strategy", "strategy_generator", "scene", scene_name) status_manager.set_status("generate_strategy", "strategy_generator", "scene", scene_name)
frame_num = DataLoadUtil.get_scene_seq_length(root, scene_name) frame_num = DataLoadUtil.get_scene_seq_length(root, scene_name)
model_points_normals = DataLoadUtil.load_points_normals(root, scene_name) model_points_normals = DataLoadUtil.load_points_normals(root, scene_name)
model_pts = model_points_normals[:,:3] model_pts = model_points_normals[:,:3]
down_sampled_model_pts = PtsUtil.voxel_downsample_point_cloud(model_pts, voxel_threshold) down_sampled_model_pts, idx = PtsUtil.voxel_downsample_point_cloud(model_pts, voxel_threshold, require_idx=True)
down_sampled_model_nrm = model_points_normals[idx, 3:]
pts_list = [] pts_list = []
nrm_list = []
scan_points_indices_list = [] scan_points_indices_list = []
non_zero_cnt = 0 non_zero_cnt = 0
for frame_idx in range(frame_num): for frame_idx in range(frame_num):
status_manager.set_progress("generate_strategy", "strategy_generator", "loading frame", frame_idx, frame_num) status_manager.set_progress("generate_strategy", "strategy_generator", "loading frame", frame_idx, frame_num)
pts_path = os.path.join(root,scene_name, "pts", f"{frame_idx}.npy") pts_path = os.path.join(root,scene_name, "pts", f"{frame_idx}.npy")
nrm_path = os.path.join(root,scene_name, "nrm", f"{frame_idx}.npy")
idx_path = os.path.join(root,scene_name, "scan_points_indices", f"{frame_idx}.npy") idx_path = os.path.join(root,scene_name, "scan_points_indices", f"{frame_idx}.npy")
point_cloud = np.load(pts_path)
sampled_point_cloud = PtsUtil.voxel_downsample_point_cloud(point_cloud, voxel_threshold)
indices = np.load(idx_path)
pts_list.append(sampled_point_cloud)
pts = np.load(pts_path)
if self.compute_with_normal:
if pts.shape[0] == 0:
nrm = np.zeros((0,3))
else:
nrm = np.load(nrm_path)
nrm_list.append(nrm)
pts_list.append(pts)
indices = np.load(idx_path)
scan_points_indices_list.append(indices) scan_points_indices_list.append(indices)
if sampled_point_cloud.shape[0] > 0: if pts.shape[0] > 0:
non_zero_cnt += 1 non_zero_cnt += 1
status_manager.set_progress("generate_strategy", "strategy_generator", "loading frame", frame_num, frame_num) status_manager.set_progress("generate_strategy", "strategy_generator", "loading frame", frame_num, frame_num)
@@ -103,7 +104,7 @@ class StrategyGenerator(Runner):
init_view_list = [] init_view_list = []
idx = 0 idx = 0
while len(init_view_list) < seq_num and idx < len(pts_list): while len(init_view_list) < seq_num and idx < len(pts_list):
if pts_list[idx].shape[0] > 100: if pts_list[idx].shape[0] > 50:
init_view_list.append(idx) init_view_list.append(idx)
idx += 1 idx += 1
@@ -112,8 +113,13 @@ class StrategyGenerator(Runner):
for init_view in init_view_list: for init_view in init_view_list:
status_manager.set_progress("generate_strategy", "strategy_generator", "computing sequence", seq_idx, len(init_view_list)) status_manager.set_progress("generate_strategy", "strategy_generator", "computing sequence", seq_idx, len(init_view_list))
start = time.time() start = time.time()
limited_useful_view, _, _ = ReconstructionUtil.compute_next_best_view_sequence_with_overlap(down_sampled_model_pts, pts_list, scan_points_indices_list = scan_points_indices_list,init_view=init_view,
threshold=voxel_threshold, soft_overlap_threshold=soft_overlap_threshold, hard_overlap_threshold= hard_overlap_threshold, scan_points_threshold=10, status_info=self.status_info) if not self.compute_with_normal:
limited_useful_view, _, _ = ReconstructionUtil.compute_next_best_view_sequence(down_sampled_model_pts, pts_list, scan_points_indices_list = scan_points_indices_list,init_view=init_view,
threshold=voxel_threshold, scan_points_threshold=self.scan_points_threshold, overlap_area_threshold=self.overlap_area_threshold, status_info=self.status_info)
else:
limited_useful_view, _, _ = ReconstructionUtil.compute_next_best_view_sequence_with_normal(down_sampled_model_pts, down_sampled_model_nrm, pts_list, nrm_list, scan_points_indices_list = scan_points_indices_list,init_view=init_view,
threshold=voxel_threshold, scan_points_threshold=self.scan_points_threshold, overlap_area_threshold=self.overlap_area_threshold, status_info=self.status_info)
end = time.time() end = time.time()
print(f"Time: {end-start}") print(f"Time: {end-start}")
data_pairs = self.generate_data_pairs(limited_useful_view) data_pairs = self.generate_data_pairs(limited_useful_view)
@@ -132,8 +138,6 @@ class StrategyGenerator(Runner):
with open(output_label_path, 'w') as f: with open(output_label_path, 'w') as f:
json.dump(seq_save_data, f) json.dump(seq_save_data, f)
seq_idx += 1 seq_idx += 1
if self.save_mesh:
DataLoadUtil.save_target_mesh_at_world_space(root, model_dir, scene_name)
status_manager.set_progress("generate_strategy", "strategy_generator", "computing sequence", len(init_view_list), len(init_view_list)) status_manager.set_progress("generate_strategy", "strategy_generator", "computing sequence", len(init_view_list), len(init_view_list))

View File

@@ -9,7 +9,7 @@ class ViewGenerator(Runner):
self.config_path = config_path
def run(self):
-result = subprocess.run(['blender', '-b', '-P', '../blender/run_blender.py', '--', self.config_path])
+result = subprocess.run(['/home/hofee/blender-4.0.2-linux-x64/blender', '-b', '-P', '../blender/run_blender.py', '--', self.config_path])
print()
def create_experiment(self, backup_name=None):
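The change above hard-codes an absolute Blender binary path for the render machine. As a hypothetical variant (not part of this repo), the same headless invocation can read the binary from an environment variable so the script stays portable; `BLENDER_BIN` and the config path below are assumed names.

```python
import os
import subprocess

# Hypothetical sketch: pick the Blender binary from the environment, falling back to PATH lookup.
blender_bin = os.environ.get("BLENDER_BIN", "blender")
config_path = "path/to/config.yaml"  # placeholder

# Same headless call pattern as in ViewGenerator.run above.
result = subprocess.run([blender_bin, "-b", "-P", "../blender/run_blender.py", "--", config_path])
print(result.returncode)
```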


@@ -4,11 +4,30 @@ import json
import cv2
import trimesh
import torch
+import OpenEXR
+import Imath
from utils.pts import PtsUtil
class DataLoadUtil:
TABLE_POSITION = np.asarray([0, 0, 0.8215])
@staticmethod
def load_exr_image(file_path):
exr_file = OpenEXR.InputFile(file_path)
header = exr_file.header()
dw = header['dataWindow']
width = dw.max.x - dw.min.x + 1
height = dw.max.y - dw.min.y + 1
float_channels = ['R', 'G', 'B']
img_data = []
for channel in float_channels:
channel_data = exr_file.channel(channel)
img_data.append(np.frombuffer(channel_data, dtype=np.float16).reshape((height, width)))
# stack the channels into a (height, width, 3) RGB image
img = np.stack(img_data, axis=-1)
return img
@staticmethod
def get_display_table_info(root, scene_name):
@@ -34,6 +53,8 @@ class DataLoadUtil:
@staticmethod
def get_label_num(root, scene_name):
label_dir = os.path.join(root, scene_name, "label")
+if not os.path.exists(label_dir):
+    return 0
return len(os.listdir(label_dir))
@staticmethod
@@ -44,11 +65,6 @@ class DataLoadUtil:
path = os.path.join(label_dir, f"{seq_idx}.json")
return path
@staticmethod
def get_label_path_old(root, scene_name):
path = os.path.join(root, scene_name, "label.json")
return path
@staticmethod
def get_scene_seq_length(root, scene_name):
camera_params_path = os.path.join(root, scene_name, "camera_params")
@@ -69,36 +85,6 @@ class DataLoadUtil:
diagonal_length = np.linalg.norm(bbox)
return diagonal_length
@staticmethod
def save_mesh_at(model_dir, output_dir, object_name, scene_name, world_object_pose):
mesh = DataLoadUtil.load_mesh_at(model_dir, object_name, world_object_pose)
model_path = os.path.join(output_dir, scene_name, "world_mesh.obj")
mesh.export(model_path)
@staticmethod
def save_target_mesh_at_world_space(
root, model_dir, scene_name, display_table_as_world_space_origin=True
):
scene_info = DataLoadUtil.load_scene_info(root, scene_name)
target_name = scene_info["target_name"]
transformation = scene_info[target_name]
if display_table_as_world_space_origin:
location = transformation["location"] - DataLoadUtil.get_display_table_top(
root, scene_name
)
else:
location = transformation["location"]
rotation_euler = transformation["rotation_euler"]
pose_mat = trimesh.transformations.euler_matrix(*rotation_euler)
pose_mat[:3, 3] = location
mesh = DataLoadUtil.load_mesh_at(model_dir, target_name, pose_mat)
mesh_dir = os.path.join(root, scene_name, "mesh")
if not os.path.exists(mesh_dir):
os.makedirs(mesh_dir)
model_path = os.path.join(mesh_dir, "world_target_mesh.obj")
mesh.export(model_path)
@staticmethod
def load_scene_info(root, scene_name):
scene_info_path = os.path.join(root, scene_name, "scene_info.json")
@@ -113,17 +99,6 @@ class DataLoadUtil:
target_pts_num_dict = json.load(f)
return target_pts_num_dict
@staticmethod
def load_target_object_pose(root, scene_name):
scene_info = DataLoadUtil.load_scene_info(root, scene_name)
target_name = scene_info["target_name"]
transformation = scene_info[target_name]
location = transformation["location"]
rotation_euler = transformation["rotation_euler"]
pose_mat = trimesh.transformations.euler_matrix(*rotation_euler)
pose_mat[:3, 3] = location
return pose_mat
@staticmethod
def load_depth(path, min_depth=0.01, max_depth=5.0, binocular=False):
@@ -161,8 +136,8 @@ class DataLoadUtil:
if binocular and not left_only:
def clean_mask(mask_image):
-green = [0, 255, 0, 255]
-red = [255, 0, 0, 255]
+green = [0, 255, 0]
+red = [255, 0, 0]
threshold = 2
mask_image = np.where(
np.abs(mask_image - green) <= threshold, green, mask_image
@@ -194,30 +169,31 @@ class DataLoadUtil:
return mask_image
@staticmethod
-def load_normal(path, binocular=False, left_only=False):
+def load_normal(path, binocular=False, left_only=False, file_type="exr"):
if binocular and not left_only:
normal_path_L = os.path.join(
-os.path.dirname(path), "normal", os.path.basename(path) + "_L.png"
+os.path.dirname(path), "normal", os.path.basename(path) + f"_L.{file_type}"
)
-normal_image_L = cv2.imread(normal_path_L, cv2.IMREAD_COLOR)
+normal_image_L = DataLoadUtil.load_exr_image(normal_path_L)
normal_path_R = os.path.join(
-os.path.dirname(path), "normal", os.path.basename(path) + "_R.png"
+os.path.dirname(path), "normal", os.path.basename(path) + f"_R.{file_type}"
)
-normal_image_R = cv2.imread(normal_path_R, cv2.IMREAD_COLOR)
+normal_image_R = DataLoadUtil.load_exr_image(normal_path_R)
-normalized_normal_image_L = normal_image_L / 255.0 * 2.0 - 1.0
-normalized_normal_image_R = normal_image_R / 255.0 * 2.0 - 1.0
+normalized_normal_image_L = normal_image_L * 2.0 - 1.0
+normalized_normal_image_R = normal_image_R * 2.0 - 1.0
return normalized_normal_image_L, normalized_normal_image_R
else:
if binocular and left_only:
normal_path = os.path.join(
-os.path.dirname(path), "normal", os.path.basename(path) + "_L.png"
+os.path.dirname(path), "normal", os.path.basename(path) + f"_L.{file_type}"
)
else:
normal_path = os.path.join(
-os.path.dirname(path), "normal", os.path.basename(path) + ".png"
+os.path.dirname(path), "normal", os.path.basename(path) + f".{file_type}"
)
-normal_image = cv2.imread(normal_path, cv2.IMREAD_COLOR)
+normal_image = DataLoadUtil.load_exr_image(normal_path)
-normalized_normal_image = normal_image / 255.0 * 2.0 - 1.0
+normalized_normal_image = normal_image * 2.0 - 1.0
return normalized_normal_image
@staticmethod @staticmethod
@@ -227,20 +203,26 @@ class DataLoadUtil:
return label_data
@staticmethod
-def load_rgb(path):
-    rgb_path = os.path.join(
-        os.path.dirname(path), "rgb", os.path.basename(path) + ".png"
-    )
-    rgb_image = cv2.imread(rgb_path, cv2.IMREAD_COLOR)
-    return rgb_image
-@staticmethod
-def load_from_preprocessed_pts(path):
+def load_from_preprocessed_pts(path, file_type="npy"):
npy_path = os.path.join(
-os.path.dirname(path), "pts", os.path.basename(path) + ".npy"
+os.path.dirname(path), "pts", os.path.basename(path) + "." + file_type
)
-pts = np.load(npy_path)
+if file_type == "txt":
+    pts = np.loadtxt(npy_path)
+else:
+    pts = np.load(npy_path)
return pts
@staticmethod
def load_from_preprocessed_nrm(path, file_type="npy"):
npy_path = os.path.join(
os.path.dirname(path), "nrm", os.path.basename(path) + "." + file_type
)
if file_type == "txt":
nrm = np.loadtxt(npy_path)
else:
nrm = np.load(npy_path)
return nrm
@staticmethod
def cam_pose_transformation(cam_pose_before):
@@ -260,11 +242,12 @@ class DataLoadUtil:
label_data = json.load(f)
cam_to_world = np.asarray(label_data["extrinsic"])
cam_to_world = DataLoadUtil.cam_pose_transformation(cam_to_world)
-world_to_display_table = np.eye(4)
-world_to_display_table[:3, 3] = -DataLoadUtil.get_display_table_top(
-    root_dir, scene_name
-)
if display_table_as_world_space_origin:
+    world_to_display_table = np.eye(4)
+    world_to_display_table[:3, 3] = -DataLoadUtil.get_display_table_top(
+        root_dir, scene_name
+    )
cam_to_world = np.dot(world_to_display_table, cam_to_world)
cam_intrinsic = np.asarray(label_data["intrinsic"])
cam_info = {
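With the switch to EXR normal maps, `load_normal` no longer divides by 255: EXR channels are already floats in [0, 1], so only the [0, 1] -> [-1, 1] remapping is applied. A minimal numpy sketch of that mapping (the random input array and the final renormalization step are illustration only, not part of the loader):

```python
import numpy as np

# Assume an EXR-decoded normal map with float values in [0, 1]; shape (H, W, 3).
normal_image = np.random.rand(4, 4, 3).astype(np.float32)

# EXR path: values are already floats, so only the [0,1] -> [-1,1] mapping is needed.
normals = normal_image * 2.0 - 1.0

# Old PNG path would additionally divide the 8-bit values by 255 first:
png_like = (normal_image * 255).astype(np.uint8)
normals_png = png_like / 255.0 * 2.0 - 1.0

# Optional: renormalize to unit length before using the vectors (illustration only).
normals_unit = normals / np.linalg.norm(normals, axis=-1, keepdims=True)
print(normals_unit.shape)
```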


@@ -1,16 +1,21 @@
import numpy as np
import open3d as o3d
import torch
from scipy.spatial import cKDTree
class PtsUtil:
@staticmethod
-def voxel_downsample_point_cloud(point_cloud, voxel_size=0.005):
-    o3d_pc = o3d.geometry.PointCloud()
-    o3d_pc.points = o3d.utility.Vector3dVector(point_cloud)
-    downsampled_pc = o3d_pc.voxel_down_sample(voxel_size)
-    return np.asarray(downsampled_pc.points)
+def voxel_downsample_point_cloud(point_cloud, voxel_size=0.005, require_idx=False):
+    voxel_indices = np.floor(point_cloud / voxel_size).astype(np.int32)
+    if require_idx:
+        _, inverse, counts = np.unique(voxel_indices, axis=0, return_inverse=True, return_counts=True)
+        idx_sort = np.argsort(inverse)
+        idx_unique = idx_sort[np.cumsum(counts)-counts]
+        downsampled_points = point_cloud[idx_unique]
+        return downsampled_points, idx_unique
+    else:
+        unique_voxels = np.unique(voxel_indices, axis=0, return_inverse=True)
+        return unique_voxels[0]*voxel_size
@staticmethod
def random_downsample_point_cloud(point_cloud, num_points, require_idx=False):
@@ -78,24 +83,24 @@ class PtsUtil:
return overlapping_points
@staticmethod
-def filter_points(points, points_normals, cam_pose, voxel_size=0.002, theta=45, z_range=(0.2, 0.45)):
+def filter_points(points, normals, cam_pose, theta_limit=45, z_range=(0.2, 0.45)):
+    """ filter with normal """
+    normals_normalized = normals / np.linalg.norm(normals, axis=1, keepdims=True)
+    cos_theta = np.dot(normals_normalized, np.array([0, 0, 1]))
+    theta = np.arccos(cos_theta) * 180 / np.pi
+    idx = theta < theta_limit
+    filtered_sampled_points = points[idx]
+    filtered_normals = normals[idx]
    """ filter with z range """
-    points_cam = PtsUtil.transform_point_cloud(points, np.linalg.inv(cam_pose))
+    points_cam = PtsUtil.transform_point_cloud(filtered_sampled_points, np.linalg.inv(cam_pose))
    idx = (points_cam[:, 2] > z_range[0]) & (points_cam[:, 2] < z_range[1])
-    z_filtered_points = points[idx]
+    z_filtered_points = filtered_sampled_points[idx]
+    z_filtered_normals = filtered_normals[idx]
-    """ filter with normal """
-    sampled_points = PtsUtil.voxel_downsample_point_cloud(z_filtered_points, voxel_size)
-    kdtree = cKDTree(points_normals[:,:3])
-    _, indices = kdtree.query(sampled_points)
-    nearest_points = points_normals[indices]
-    normals = nearest_points[:, 3:]
-    camera_axis = -cam_pose[:3, 2]
-    normals_normalized = normals / np.linalg.norm(normals, axis=1, keepdims=True)
-    cos_theta = np.dot(normals_normalized, camera_axis)
-    theta_rad = np.deg2rad(theta)
-    idx = cos_theta > np.cos(theta_rad)
-    filtered_sampled_points= sampled_points[idx]
-    return filtered_sampled_points[:, :3]
+    return z_filtered_points[:, :3], z_filtered_normals
+
+@staticmethod
+def point_to_hash(point, voxel_size):
+    return tuple(np.floor(point / voxel_size).astype(int))
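The new `voxel_downsample_point_cloud` replaces the Open3D call with a pure-numpy version that can also return the indices of the kept points, which is what lets the normal-aware strategy carry normals through the downsampling (`normals[idx]` stays aligned with the downsampled points). A self-contained sketch of the same bucketing logic on a toy cloud; the function is restated here only so the snippet runs without the repo.

```python
import numpy as np

def voxel_downsample(point_cloud, voxel_size=0.005, require_idx=False):
    # Bucket points into voxels and keep one representative point per occupied voxel,
    # mirroring PtsUtil.voxel_downsample_point_cloud above.
    voxel_indices = np.floor(point_cloud / voxel_size).astype(np.int32)
    if require_idx:
        _, inverse, counts = np.unique(voxel_indices, axis=0, return_inverse=True, return_counts=True)
        idx_sort = np.argsort(inverse)
        idx_unique = idx_sort[np.cumsum(counts) - counts]   # first point of each voxel group
        return point_cloud[idx_unique], idx_unique
    unique_voxels = np.unique(voxel_indices, axis=0, return_inverse=True)
    return unique_voxels[0] * voxel_size

pts = np.array([[0.001, 0.001, 0.001],
                [0.002, 0.001, 0.000],   # falls into the same voxel as the first point
                [0.012, 0.000, 0.000]])  # a different voxel
down, idx = voxel_downsample(pts, voxel_size=0.005, require_idx=True)
print(down.shape, idx)  # two voxels survive; idx indexes into pts, so normals[idx] stays aligned
```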


@@ -3,25 +3,43 @@ from scipy.spatial import cKDTree
from utils.pts import PtsUtil
class ReconstructionUtil:
@staticmethod
def compute_coverage_rate(target_point_cloud, combined_point_cloud, threshold=0.01):
kdtree = cKDTree(combined_point_cloud)
distances, _ = kdtree.query(target_point_cloud)
-covered_points_num = np.sum(distances < threshold)
+covered_points_num = np.sum(distances < threshold*2)
coverage_rate = covered_points_num / target_point_cloud.shape[0]
return coverage_rate, covered_points_num
@staticmethod
-def compute_overlap_rate(new_point_cloud, combined_point_cloud, threshold=0.01):
+def compute_coverage_rate_with_normal(target_point_cloud, combined_point_cloud, target_normal, combined_normal, threshold=0.01, normal_threshold=0.1):
kdtree = cKDTree(combined_point_cloud)
distances, indices = kdtree.query(target_point_cloud)
is_covered_by_distance = distances < threshold*2
normal_dots = np.einsum('ij,ij->i', target_normal, combined_normal[indices])
is_covered_by_normal = normal_dots > normal_threshold
pts_nrm_target = np.hstack([target_point_cloud, target_normal])
np.savetxt("pts_nrm_target.txt", pts_nrm_target)
pts_nrm_combined = np.hstack([combined_point_cloud, combined_normal])
np.savetxt("pts_nrm_combined.txt", pts_nrm_combined)
import ipdb; ipdb.set_trace()
covered_points_num = np.sum(is_covered_by_distance & is_covered_by_normal)
coverage_rate = covered_points_num / target_point_cloud.shape[0]
return coverage_rate, covered_points_num
@staticmethod
def check_overlap(new_point_cloud, combined_point_cloud, overlap_area_threshold=25, voxel_size=0.01):
kdtree = cKDTree(combined_point_cloud)
distances, _ = kdtree.query(new_point_cloud)
-overlapping_points = np.sum(distances < threshold)
-if new_point_cloud.shape[0] == 0:
-    overlap_rate = 0
-else:
-    overlap_rate = overlapping_points / new_point_cloud.shape[0]
-return overlap_rate
+overlapping_points = np.sum(distances < voxel_size*2)
+cm = 0.01
+voxel_size_cm = voxel_size / cm
+overlap_area = overlapping_points * voxel_size_cm * voxel_size_cm
+return overlap_area > overlap_area_threshold
@staticmethod
@@ -37,14 +55,14 @@ class ReconstructionUtil:
return new_added_points
@staticmethod
-def compute_next_best_view_sequence_with_overlap(target_point_cloud, point_cloud_list, scan_points_indices_list, threshold=0.01, soft_overlap_threshold=0.5, hard_overlap_threshold=0.7, init_view = 0, scan_points_threshold=5, status_info=None):
+def compute_next_best_view_sequence(target_point_cloud, point_cloud_list, scan_points_indices_list, threshold=0.01, overlap_area_threshold=25, init_view = 0, scan_points_threshold=5, status_info=None):
selected_views = [init_view]
combined_point_cloud = point_cloud_list[init_view]
history_indices = [scan_points_indices_list[init_view]]
max_rec_pts = np.vstack(point_cloud_list)
downsampled_max_rec_pts = PtsUtil.voxel_downsample_point_cloud(max_rec_pts, threshold)
-combined_point_cloud = PtsUtil.voxel_downsample_point_cloud(combined_point_cloud, threshold)
max_rec_pts_num = downsampled_max_rec_pts.shape[0]
max_real_rec_pts_coverage, _ = ReconstructionUtil.compute_coverage_rate(target_point_cloud, downsampled_max_rec_pts, threshold)
@@ -57,6 +75,7 @@ class ReconstructionUtil:
cnt_processed_view = 0
remaining_views.remove(init_view)
curr_rec_pts_num = combined_point_cloud.shape[0]
+drop_output_ratio = 0.4
import time
while remaining_views:
@@ -66,27 +85,23 @@ class ReconstructionUtil:
best_covered_num = 0
for view_index in remaining_views:
+if np.random.rand() < drop_output_ratio:
+    continue
if point_cloud_list[view_index].shape[0] == 0:
continue
if selected_views:
new_scan_points_indices = scan_points_indices_list[view_index]
if not ReconstructionUtil.check_scan_points_overlap(history_indices, new_scan_points_indices, scan_points_threshold):
-overlap_threshold = hard_overlap_threshold
+curr_overlap_area_threshold = overlap_area_threshold
else:
-overlap_threshold = soft_overlap_threshold
+curr_overlap_area_threshold = overlap_area_threshold * 0.5
-start = time.time()
-overlap_rate = ReconstructionUtil.compute_overlap_rate(point_cloud_list[view_index],combined_point_cloud, threshold)
-end = time.time()
-# print(f"overlap_rate Time: {end-start}")
-if overlap_rate < overlap_threshold:
+if not ReconstructionUtil.check_overlap(point_cloud_list[view_index], combined_point_cloud, overlap_area_threshold = curr_overlap_area_threshold, voxel_size=threshold):
continue
-start = time.time()
new_combined_point_cloud = np.vstack([combined_point_cloud, point_cloud_list[view_index]])
new_downsampled_combined_point_cloud = PtsUtil.voxel_downsample_point_cloud(new_combined_point_cloud,threshold)
new_coverage, new_covered_num = ReconstructionUtil.compute_coverage_rate(downsampled_max_rec_pts, new_downsampled_combined_point_cloud, threshold)
-end = time.time()
-#print(f"compute_coverage_rate Time: {end-start}")
coverage_increase = new_coverage - current_coverage
if coverage_increase > best_coverage_increase:
best_coverage_increase = coverage_increase
@@ -95,6 +110,100 @@ class ReconstructionUtil:
best_combined_point_cloud = new_downsampled_combined_point_cloud
if best_view is not None:
if best_coverage_increase <=1e-3 or best_covered_num - current_covered_num <= 5:
break
selected_views.append(best_view)
best_rec_pts_num = best_combined_point_cloud.shape[0]
print(f"Current rec pts num: {curr_rec_pts_num}, Best rec pts num: {best_rec_pts_num}, Best cover pts: {best_covered_num}, Max rec pts num: {max_rec_pts_num}")
print(f"Current coverage: {current_coverage+best_coverage_increase}, Best coverage increase: {best_coverage_increase}, Max Real coverage: {max_real_rec_pts_coverage}")
current_covered_num = best_covered_num
curr_rec_pts_num = best_rec_pts_num
combined_point_cloud = best_combined_point_cloud
remaining_views.remove(best_view)
history_indices.append(scan_points_indices_list[best_view])
current_coverage += best_coverage_increase
cnt_processed_view += 1
if status_info is not None:
sm = status_info["status_manager"]
app_name = status_info["app_name"]
runner_name = status_info["runner_name"]
sm.set_status(app_name, runner_name, "current coverage", current_coverage)
sm.set_progress(app_name, runner_name, "processed view", cnt_processed_view, len(point_cloud_list))
view_sequence.append((best_view, current_coverage))
else:
break
if status_info is not None:
sm = status_info["status_manager"]
app_name = status_info["app_name"]
runner_name = status_info["runner_name"]
sm.set_progress(app_name, runner_name, "processed view", len(point_cloud_list), len(point_cloud_list))
return view_sequence, remaining_views, combined_point_cloud
@staticmethod
def compute_next_best_view_sequence_with_normal(target_point_cloud, target_normal, point_cloud_list, normal_list, scan_points_indices_list, threshold=0.01, overlap_area_threshold=25, init_view = 0, scan_points_threshold=5, status_info=None):
selected_views = [init_view]
combined_point_cloud = point_cloud_list[init_view]
combined_normal = normal_list[init_view]
history_indices = [scan_points_indices_list[init_view]]
max_rec_pts = np.vstack(point_cloud_list)
max_rec_nrm = np.vstack(normal_list)
downsampled_max_rec_pts, idx = PtsUtil.voxel_downsample_point_cloud(max_rec_pts, threshold, require_idx=True)
downsampled_max_rec_nrm = max_rec_nrm[idx]
max_rec_pts_num = downsampled_max_rec_pts.shape[0]
try:
max_real_rec_pts_coverage, _ = ReconstructionUtil.compute_coverage_rate_with_normal(target_point_cloud, downsampled_max_rec_pts, target_normal, downsampled_max_rec_nrm, threshold)
except:
import ipdb; ipdb.set_trace()
new_coverage, new_covered_num = ReconstructionUtil.compute_coverage_rate_with_normal(downsampled_max_rec_pts, combined_point_cloud, downsampled_max_rec_nrm, combined_normal, threshold)
current_coverage = new_coverage
current_covered_num = new_covered_num
remaining_views = list(range(len(point_cloud_list)))
view_sequence = [(init_view, current_coverage)]
cnt_processed_view = 0
remaining_views.remove(init_view)
curr_rec_pts_num = combined_point_cloud.shape[0]
while remaining_views:
best_view = None
best_coverage_increase = -1
best_combined_point_cloud = None
best_combined_normal = None
best_covered_num = 0
for view_index in remaining_views:
if point_cloud_list[view_index].shape[0] == 0:
continue
if selected_views:
new_scan_points_indices = scan_points_indices_list[view_index]
if not ReconstructionUtil.check_scan_points_overlap(history_indices, new_scan_points_indices, scan_points_threshold):
curr_overlap_area_threshold = overlap_area_threshold
else:
curr_overlap_area_threshold = overlap_area_threshold * 0.5
if not ReconstructionUtil.check_overlap(point_cloud_list[view_index], combined_point_cloud, overlap_area_threshold = curr_overlap_area_threshold, voxel_size=threshold):
continue
new_combined_point_cloud = np.vstack([combined_point_cloud, point_cloud_list[view_index]])
new_combined_normal = np.vstack([combined_normal, normal_list[view_index]])
new_downsampled_combined_point_cloud, idx = PtsUtil.voxel_downsample_point_cloud(new_combined_point_cloud,threshold, require_idx=True)
new_downsampled_combined_normal = new_combined_normal[idx]
new_coverage, new_covered_num = ReconstructionUtil.compute_coverage_rate_with_normal(downsampled_max_rec_pts, new_downsampled_combined_point_cloud, downsampled_max_rec_nrm, new_downsampled_combined_normal, threshold)
coverage_increase = new_coverage - current_coverage
if coverage_increase > best_coverage_increase:
best_coverage_increase = coverage_increase
best_view = view_index
best_covered_num = new_covered_num
best_combined_point_cloud = new_downsampled_combined_point_cloud
best_combined_normal = new_downsampled_combined_normal
if best_view is not None:
if best_coverage_increase <=1e-3 or best_covered_num - current_covered_num <= 5:
break
@@ -106,6 +215,7 @@ class ReconstructionUtil:
current_covered_num = best_covered_num
curr_rec_pts_num = best_rec_pts_num
combined_point_cloud = best_combined_point_cloud
combined_normal = best_combined_normal
remaining_views.remove(best_view)
history_indices.append(scan_points_indices_list[best_view])
current_coverage += best_coverage_increase
@@ -145,18 +255,6 @@ class ReconstructionUtil:
attempts += 1
return points
@staticmethod
def compute_covered_scan_points(scan_points, point_cloud, threshold=0.01):
tree = cKDTree(point_cloud)
covered_points = []
indices = []
for i, scan_point in enumerate(scan_points):
if tree.query_ball_point(scan_point, threshold):
covered_points.append(scan_point)
indices.append(i)
return covered_points, indices
@staticmethod
def check_scan_points_overlap(history_indices, indices2, threshold=5):
for indices1 in history_indices:
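The new `check_overlap` replaces the old overlap *ratio* test with an absolute overlap *area* test: every new-view point that lands within `2 * voxel_size` of the existing reconstruction contributes one voxel footprint of `(voxel_size in cm)^2`, and the candidate view is accepted only if the summed area exceeds the threshold (25 cm^2 by default). A small numeric sketch of that arithmetic; the voxel size and point count below are assumed, not taken from the repo.

```python
# Worked example of the overlap-area test above (numbers are assumed).
voxel_size = 0.003            # metres, i.e. the reconstruction threshold
overlapping_points = 300      # new-view points within 2 * voxel_size of the current cloud

cm = 0.01
voxel_size_cm = voxel_size / cm                            # 0.3 cm per voxel edge
overlap_area = overlapping_points * voxel_size_cm ** 2     # 300 * 0.09 = 27.0 cm^2

overlap_area_threshold = 25   # cm^2, the default above
print(overlap_area, overlap_area > overlap_area_threshold)  # 27.0 True -> the view passes
```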

utils/vis.py (new file)

@@ -0,0 +1,192 @@
import numpy as np
import matplotlib.pyplot as plt
import sys
import os
import trimesh
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from utils.data_load import DataLoadUtil
from utils.pts import PtsUtil
class visualizeUtil:
@staticmethod
def save_all_cam_pos_and_cam_axis(root, scene, output_dir):
length = DataLoadUtil.get_scene_seq_length(root, scene)
all_cam_pos = []
all_cam_axis = []
for i in range(length):
path = DataLoadUtil.get_path(root, scene, i)
cam_info = DataLoadUtil.load_cam_info(path, binocular=True)
cam_pose = cam_info["cam_to_world"]
cam_pos = cam_pose[:3, 3]
cam_axis = cam_pose[:3, 2]
num_samples = 10
sample_points = [cam_pos + 0.02*t * cam_axis for t in range(num_samples)]
sample_points = np.array(sample_points)
all_cam_pos.append(cam_pos)
all_cam_axis.append(sample_points)
all_cam_pos = np.array(all_cam_pos)
all_cam_axis = np.array(all_cam_axis).reshape(-1, 3)
np.savetxt(os.path.join(output_dir, "all_cam_pos.txt"), all_cam_pos)
np.savetxt(os.path.join(output_dir, "all_cam_axis.txt"), all_cam_axis)
@staticmethod
def save_all_combined_pts(root, scene, output_dir):
length = DataLoadUtil.get_scene_seq_length(root, scene)
all_combined_pts = []
for i in range(length):
path = DataLoadUtil.get_path(root, scene, i)
pts = DataLoadUtil.load_from_preprocessed_pts(path,"npy")
if pts.shape[0] == 0:
continue
all_combined_pts.append(pts)
all_combined_pts = np.vstack(all_combined_pts)
downsampled_all_pts = PtsUtil.voxel_downsample_point_cloud(all_combined_pts, 0.001)
np.savetxt(os.path.join(output_dir, "all_combined_pts.txt"), downsampled_all_pts)
@staticmethod
def save_seq_cam_pos_and_cam_axis(root, scene, frame_idx_list, output_dir):
all_cam_pos = []
all_cam_axis = []
for i in frame_idx_list:
path = DataLoadUtil.get_path(root, scene, i)
cam_info = DataLoadUtil.load_cam_info(path, binocular=True)
cam_pose = cam_info["cam_to_world"]
cam_pos = cam_pose[:3, 3]
cam_axis = cam_pose[:3, 2]
num_samples = 10
sample_points = [cam_pos + 0.02*t * cam_axis for t in range(num_samples)]
sample_points = np.array(sample_points)
all_cam_pos.append(cam_pos)
all_cam_axis.append(sample_points)
all_cam_pos = np.array(all_cam_pos)
all_cam_axis = np.array(all_cam_axis).reshape(-1, 3)
np.savetxt(os.path.join(output_dir, "seq_cam_pos.txt"), all_cam_pos)
np.savetxt(os.path.join(output_dir, "seq_cam_axis.txt"), all_cam_axis)
@staticmethod
def save_seq_combined_pts(root, scene, frame_idx_list, output_dir):
all_combined_pts = []
for i in frame_idx_list:
path = DataLoadUtil.get_path(root, scene, i)
pts = DataLoadUtil.load_from_preprocessed_pts(path,"npy")
if pts.shape[0] == 0:
continue
all_combined_pts.append(pts)
all_combined_pts = np.vstack(all_combined_pts)
downsampled_all_pts = PtsUtil.voxel_downsample_point_cloud(all_combined_pts, 0.001)
np.savetxt(os.path.join(output_dir, "seq_combined_pts.txt"), downsampled_all_pts)
@staticmethod
def save_target_mesh_at_world_space(
root, model_dir, scene_name, display_table_as_world_space_origin=True
):
scene_info = DataLoadUtil.load_scene_info(root, scene_name)
target_name = scene_info["target_name"]
transformation = scene_info[target_name]
if display_table_as_world_space_origin:
location = transformation["location"] - DataLoadUtil.get_display_table_top(
root, scene_name
)
else:
location = transformation["location"]
rotation_euler = transformation["rotation_euler"]
pose_mat = trimesh.transformations.euler_matrix(*rotation_euler)
pose_mat[:3, 3] = location
mesh = DataLoadUtil.load_mesh_at(model_dir, target_name, pose_mat)
mesh_dir = os.path.join(root, scene_name, "mesh")
if not os.path.exists(mesh_dir):
os.makedirs(mesh_dir)
model_path = os.path.join(mesh_dir, "world_target_mesh.obj")
mesh.export(model_path)
@staticmethod
def save_points_and_normals(root, scene, frame_idx, output_dir, binocular=False):
target_mask_label = (0, 255, 0, 255)
path = DataLoadUtil.get_path(root, scene, frame_idx)
cam_info = DataLoadUtil.load_cam_info(path, binocular=binocular, display_table_as_world_space_origin=False)
depth = DataLoadUtil.load_depth(
path, cam_info["near_plane"],
cam_info["far_plane"],
binocular=binocular,
)
if isinstance(depth, tuple):
depth = depth[0]
mask = DataLoadUtil.load_seg(path, binocular=binocular, left_only=True)
normal = DataLoadUtil.load_normal(path, binocular=binocular, left_only=True)
''' target points '''
if mask is None:
target_mask_img = np.ones_like(depth, dtype=bool)
else:
target_mask_img = (mask == target_mask_label).all(axis=-1)
cam_intrinsic = cam_info["cam_intrinsic"]
z = depth[target_mask_img]
i, j = np.nonzero(target_mask_img)
x = (j - cam_intrinsic[0, 2]) * z / cam_intrinsic[0, 0]
y = (i - cam_intrinsic[1, 2]) * z / cam_intrinsic[1, 1]
random_downsample_N = 1000
points_camera = np.stack((x, y, z), axis=-1).reshape(-1, 3)
normal_camera = normal[target_mask_img].reshape(-1, 3)
sampled_target_points, idx = PtsUtil.random_downsample_point_cloud(
points_camera, random_downsample_N, require_idx=True
)
if len(sampled_target_points) == 0:
print("No target points")
sampled_normal_camera = normal_camera[idx]
sampled_visualized_normal = []
sampled_normal_camera[:, 2] = -sampled_normal_camera[:, 2]
sampled_normal_camera[:, 1] = -sampled_normal_camera[:, 1]
num_samples = 10
for i in range(len(sampled_target_points)):
sampled_visualized_normal.append([sampled_target_points[i] + 0.02*t * sampled_normal_camera[i] for t in range(num_samples)])
sampled_visualized_normal = np.array(sampled_visualized_normal).reshape(-1, 3)
np.savetxt(os.path.join(output_dir, "target_pts.txt"), sampled_target_points)
np.savetxt(os.path.join(output_dir, "target_normal.txt"), sampled_visualized_normal)
@staticmethod
def save_pts_nrm(root, scene, frame_idx, output_dir, binocular=False):
path = DataLoadUtil.get_path(root, scene, frame_idx)
pts_world = DataLoadUtil.load_from_preprocessed_pts(path, "npy")
nrm_camera = DataLoadUtil.load_from_preprocessed_nrm(path, "npy")
cam_info = DataLoadUtil.load_cam_info(path, binocular=binocular)
cam_to_world = cam_info["cam_to_world"]
nrm_world = nrm_camera @ cam_to_world[:3, :3].T
visualized_nrm = []
num_samples = 10
for i in range(len(pts_world)):
for t in range(num_samples):
visualized_nrm.append(pts_world[i] - 0.02 * t * nrm_world[i])
visualized_nrm = np.array(visualized_nrm)
np.savetxt(os.path.join(output_dir, "nrm.txt"), visualized_nrm)
np.savetxt(os.path.join(output_dir, "pts.txt"), pts_world)
# ------ Debug ------
if __name__ == "__main__":
root = r"C:\Document\Local Project\nbv_rec\nbv_reconstruction\temp"
model_dir = r"H:\\AI\\Datasets\\scaled_object_box_meshes"
scene = "box"
output_dir = r"C:\Document\Local Project\nbv_rec\nbv_reconstruction\test"
#visualizeUtil.save_all_cam_pos_and_cam_axis(root, scene, output_dir)
# visualizeUtil.save_all_combined_pts(root, scene, output_dir)
# visualizeUtil.save_seq_combined_pts(root, scene, [0, 121, 286, 175, 111,366,45,230,232,225,255,17,199,78,60], output_dir)
# visualizeUtil.save_seq_cam_pos_and_cam_axis(root, scene, [0, 121, 286, 175, 111,366,45,230,232,225,255,17,199,78,60], output_dir)
# visualizeUtil.save_target_mesh_at_world_space(root, model_dir, scene)
#visualizeUtil.save_points_and_normals(root, scene,"10", output_dir, binocular=True)
visualizeUtil.save_pts_nrm(root, scene, "116", output_dir, binocular=True)
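The visualize utilities above dump plain `.txt` point lists (combined clouds, camera positions and axes, and normals drawn as short point runs). A minimal sketch for inspecting one of those dumps with Open3D, which the repo already imports elsewhere; the file names follow the ones written above, and the viewer call itself is just an illustration of one way to look at the output.

```python
import numpy as np
import open3d as o3d

# Load dumps written by visualizeUtil.save_pts_nrm (paths are assumed to be in the working directory).
pts = np.loadtxt("pts.txt")          # (N, 3) world-space points
nrm_samples = np.loadtxt("nrm.txt")  # points sampled along each normal, also (M, 3)

pc = o3d.geometry.PointCloud()
pc.points = o3d.utility.Vector3dVector(np.vstack([pts, nrm_samples]))
o3d.visualization.draw_geometries([pc])  # opens an interactive viewer window
```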