Hello,
I recalculated chunk_transform, camera_transform, and Extrinsic. For the first image I get:
camera_transform  # first image
Out[62]:
array([[ 0.99989019,  0.00313269,  0.01448406,  0.00538381],
       [ 0.00257335, -0.99925621,  0.03847608, -0.0258201 ],
       [ 0.01459382, -0.03843458, -0.99915454, -0.04189182],
       [ 0.        ,  0.        ,  0.        ,  1.        ]])
chunk_transform
Out[63]:
array([[-0.51014193, -0.20067086,  1.15149808,  3.29045399],
       [ 0.02437204, -1.25794549, -0.20842397, -1.8702646 ],
       [ 1.16859857, -0.06136575,  0.5070237 ,  1.26963717],
       [ 0.        ,  0.        ,  0.        ,  1.        ]])
ext_transform = np.matmul(chunk_transform,camera_transform)
Extrinsic = np.linalg.inv(ext_transform)
Extrinsic
Out[66]:
array([[-0.30360238,  0.01112263,  0.72286483,  0.09731044],
       [ 0.09509364,  0.77782164,  0.02797098,  1.07889018],
       [-0.71666902,  0.09849601, -0.30251569,  2.88552426],
       [ 0.        ,  0.        ,  0.        ,  1.        ]])
The difference with chunk_transform is that it has the form [s*rot | trans] rather than [rot | trans], where s is a uniform scale factor.
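To recover s, a quick sketch using the chunk_transform printed above (this assumes the scale is uniform, so every column of the 3x3 block has the same norm):

```python
import numpy as np

chunk_transform = np.array([
    [-0.51014193, -0.20067086,  1.15149808,  3.29045399],
    [ 0.02437204, -1.25794549, -0.20842397, -1.8702646 ],
    [ 1.16859857, -0.06136575,  0.5070237 ,  1.26963717],
    [ 0.        ,  0.        ,  0.        ,  1.        ]])

# With uniform scale, the norm of any column of the 3x3 block equals s
s = np.linalg.norm(chunk_transform[:3, 0])

# Dividing the 3x3 block by s leaves a pure [rot | trans] transform
unscaled = chunk_transform.copy()
unscaled[:3, :3] /= s

print(s)  # the chunk's scale factor
```

You can sanity-check the result: `unscaled[:3, :3] @ unscaled[:3, :3].T` should be close to the identity if the remaining block is a pure rotation.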
Also, you cannot directly multiply Intrinsic by Extrinsic, since their shapes, (3, 3) and (4, 4), are incompatible.
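If you do want a single projection matrix, the usual trick is to drop the last row of Extrinsic first, giving a (3, 4) matrix that the (3, 3) Intrinsic can multiply. A sketch (the fx, fy, cx, cy values in K are placeholders, not your calibration):

```python
import numpy as np

# Placeholder intrinsics -- substitute your calibrated fx, fy, cx, cy
K = np.array([[2000.,    0., 960.],
              [   0., 2000., 540.],
              [   0.,    0.,   1.]])

Extrinsic = np.array([
    [-0.30360238,  0.01112263,  0.72286483,  0.09731044],
    [ 0.09509364,  0.77782164,  0.02797098,  1.07889018],
    [-0.71666902,  0.09849601, -0.30251569,  2.88552426],
    [ 0.        ,  0.        ,  0.        ,  1.        ]])

# (3, 3) @ (3, 4) -> (3, 4) projection matrix
P = K @ Extrinsic[:3, :]
print(P.shape)  # (3, 4)
```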
Once you get camera coordinates (Xcam, Ycam, Zcam) by applying Extrinsic to the world (model) coordinates, you must divide by depth to obtain the normalized coordinates (Xcam/Zcam, Ycam/Zcam, 1) before applying the Intrinsic matrix to get pixel coordinates. Note that the Intrinsic matrix does not account for lens distortion (k1, k2, k3, p1, p2)...
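Step by step, the projection might look like this (a sketch; the intrinsics in K are placeholders, and distortion is ignored as noted above):

```python
import numpy as np

# Placeholder intrinsics -- substitute your calibrated values
K = np.array([[2000.,    0., 960.],
              [   0., 2000., 540.],
              [   0.,    0.,   1.]])

Extrinsic = np.array([
    [-0.30360238,  0.01112263,  0.72286483,  0.09731044],
    [ 0.09509364,  0.77782164,  0.02797098,  1.07889018],
    [-0.71666902,  0.09849601, -0.30251569,  2.88552426],
    [ 0.        ,  0.        ,  0.        ,  1.        ]])

def project(world_pt):
    """World/model (X, Y, Z) -> pixel (u, v), distortion ignored."""
    # 1. World -> camera coordinates via the 4x4 Extrinsic
    Xc, Yc, Zc, _ = Extrinsic @ np.append(world_pt, 1.0)
    # 2. Divide by depth to get (Xcam/Zcam, Ycam/Zcam, 1)
    # 3. Apply the Intrinsic matrix to get pixel coordinates
    u, v, _ = K @ np.array([Xc / Zc, Yc / Zc, 1.0])
    return u, v

print(project([0.0, 0.0, 0.0]))
```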
Hope this helps you along the way to more wisdom...