These are my reading notes; some are comprehensive and some are brief. Some of them combine my own opinions with other people's public notes. The main areas covered are object detection, image-to-image translation, and video-related tasks. The papers are listed below.
- About_detection
- A_convnet_for_non_maximum_suppression
- CoupleNet
- Deformable_Convolutional_Networks
- DSSD
- ExtremeNet
- Faster_RCN
- FCOS
- Learning_non_maximum_suppression
- Mask-RCNN
- MDSSD
- MultiBox
- Non_local_Neural_Networks
- Objects_as_Points
- Object_detection_at_200_Frames_Per_Second
- OR-CNN
- R-FCN
- Relation_Networks_for_Object_Detection
- RepMet
- RepPoints
- ScratchDet
- SNet
- soft_nms
- SRN
- SSD
- Weaving_Multi_scale_Context_for_Single_Shot_Detector
- YOLO
- YOLOv3
- Zero-Shot_Detection
- [BPN]_Single-Shot_Bidirectional_Pyramid_Networks_for_High-Quality_Object_Detection
- [DES]_Single_Shot_Object_Detection_with_Enriched_Semantics
- [Fire_SSD]
- [M-FRCNN]Object_Detection_with_Mask-based_Feature_Encoding
- [RefineDet]_Single-Shot_Refinement_Neural_Network_for_Object_Detection
- [RFB]_Receptive_Field_Block_Net_for_Accurate_and_Fast_Object_Detection
- [S3FD]_Single_Shot_Scale-invariant_Face_Detector
- [SIN]_Structure_Inference_Net
- Attention-GAN_for_Object_Transfiguration_in_Wild_Images
- Composable_Unpaired_Image_to_Image_Translation
- DualGAN_Unsupervised_Dual_Learning_for_Image-to-Image_Translation
- Generating_a_Fusion_Image_One’s_Identity_and_Another’s_Shape
- Learning_to_Deblur_Images_with_Exemplars
- WaterGAN_Unsupervised_Generative_Network_to_Enable_Real-time_Color_Correction_of_Monocular_Underwater_Images
- [cd-GAN]_Conditional_Image-to-Image_Translation
- [CGAN]_Conditional_Generative_Adversarial_Nets
- [CycleGAN]_Unpaired_Image-to-Image_Translation_using_Cycle-Consistent_Adversarial_Networks
- [DCGAN]_Unsupervised_Representation_Learning_with_Deep_Convolutional_Generative_Adversarial_Networks
- [GAN_AC]_Connecting_Generative_Adversarial_Networks_and_Actor-Critic_Methods
- [ITGAN]_Improved_Techniques_for_Training_GANs
- [IWGAN]_Improved_Training_of_Wasserstein_GANs
- [OpticalFlow]_Semi-Supervised_Learning_for_Optical_Flow_with_Generative_Adversarial_Networks
- [pix2pix]_Image-to-Image_Translation_with_Conditional_Adversarial_Networks
- [SRGAN]_Photo-Realistic_Single_Image_Super-Resolution_Using_a_Generative_Adversarial_Network
- [SSGAN]_Semi-supervised_Conditional_GANs
- [STN]_Spatial_Transformer_Networks
- [STSR]_Perceptual_Losses_for_Real-Time_Style_Transfer_and_Super-Resolution
- [UNIT]_Unsupervised_Image-to-Image_Translation_Networks
- About_RL
- A_Survey_on_Transfer_Learning
- Connecting_Generative_Adversarial_Networks_and_Actor_Critic_Methods
- DHER
- Learning_to_Learn_Meta-Critic_Networks_for_Sample_Efficient_Learning
- Self-critical_Sequence_Training_for_Image_Captioning
- Taskonomy
- [A3C]_Asynchronous_Methods_for_Deep_Reinforcement_Learning
- [DDPG]_Continuous_control_with_deep_reinforcement_learning
- [DPG]_Deterministic_Policy_Gradient_Algorithms
- [KTAN]_Knowledge_Transfer_Adversarial_Network
- AboutSTN
- About_DeepSLAM
- Camera_Relocalization_by_Computing_Pairwise_Relative_Poses_Using_Convolutional_Neural_Network
- Continuous_Conditional_Random_Fields_for_Efficient_Regression_in_Large_Fully_Connected_Graphs
- Depth_Map_Prediction_from_a_Single_Image_using_a_Multi_Scale_Deep_Network
- Geometric_loss_functions_for_camera_pose_regression_with_deep_learning
- ICSTN
- Learning_Depth_from_Monocular_Videos_using_Direct_Methods
- Learning_Depth_from_Single_Monocular_Images_Using_Deep_Convolutional_Neural_Fields
- Learning_to_Zoom_a_Saliency_Based_Sampling_Layer_for_Neural_Networks
- Multi_Scale_Continuous_CRFs_as_Sequential_Deep_Networks_for_Monocular_Depth_Estimation
- Object_Based_Affordances_Detection_with_Convolutional_Neural_Networks_and_Dense_Conditional_Random_Fields
- Real_time_self_adaptive_deep_stereo
- Sparse_to_Dense_ Depth_Prediction_from_Sparse_Depth_Samples_and_a_Single_Image
- Unsupervised_CNN_for_Single_View_Depth_Estimation_Geometry_to_the_Rescue
- Unsupervised_Learning_of_Monocular_Depth_Estimation_and_Visual_Odometry_with_Deep_Feature_Reconstruction
- Unsupervised_Monocular_Depth_Estimation_with_Left_Right_Consistency
- [AnyNet]_Anytime_Stereo_Image_Depth_Estimation_on_Mobile_Devices
- [BayesianPosNet]_Modelling_Uncertainty_in_Deep_Learning_for_Camera_Relocalization
- [DeMoN]_Depth_and_Motion_Network_for_Learning_Monocular_Stereo
- [PoseLSTM]_Image_based_localization_using_LSTMs_for_structured_feature_correlation
- [PoseNet]_A_Convolutional_Network_for_Real_Time_6_DOF_Camera_Relocalization
- [PSMNet]_Pyramid_Stereo_Matching_Network
- [RARE]_Robust_Scene_Text_Recognition_with_Automatic_Rectification
- [SfMLearner]_Unsupervised_Learning_of_Depth_and_Ego_Motion_from_Video
- [UnDeepVO]_Monocular_Visual_Odometry_through_Unsupervised_Deep_Learning
- BoLTVOS
- Box-driven_Class-wise_Region_Masking_and_Filling_Rate_Guided_Loss_for_Weakly_Supervised_Semantic_Segmentation
- Co-occurrent_Features_in_Semantic_Segmentation
- ConvCRFs
- CRF_as_RNN
- DANet
- DenseCRF
- Discriminative_Training_of_Deep_Fully_connected_Continuous_CRFs_with_Task_specific_Loss
- Dynamic_Video_Segmentation_Network
- FeelVOS
- One_Shot_Instance_Segmentation
- Primary_Video_Object_Segmentation_via_Complementary_CNNs_and_Neighborhood_Reversible_Flow
- PTS
- RGMP
- S4Net
- TensorMask
- Video_Instance_Segmentation
- Video_Object_Segmentation_with_Language_Referring_Expressions
- Video_Segmentation_by_Tracking_Many_Figure_Ground_Segments
- YOLACT
- AboutGAN
- Auto-Directed_Video_Stabilization_with_Robust_L1_Optimal_Camera_Paths
- A_Fast_Orientation_Estimation_Approach_of_Natural_Images
- Deep_multi-scale_video_prediction_beyond_mean_square_error
- Deep_video_deblurring
- Generating_Videos_with_Scene_Dynamics
- Temporal_generative_adversarial_nets_with_singular_value_clipping
- Unsupervised_Learning_for_Physical_Interaction_through_Video_Prediction
- Unsupervised_Learning_of_Video_Representations_using_LSTMs
- [C-RNN-GAN]_Continuous_recurrent_neural_networks_with_adversarial_training
- [CodingFlow]_Enable_Video_Coding_for_Video_Stabilization
- [DBLRGAN]_Adversarial_Spatio-Temporal_Learning_for_Video_Deblurring
- [FFNet]_Video_Fast-Forwarding_via_Reinforcement_Learning
- [GRAN]_Generating_images_with_recurrent_adversarial_networks
- [LAPGAN]_Deep_Generative_Image_Models_using_a_Laplacian_Pyramid_of_Adversarial_Networks
- [MeshFlow]_Minimum_Latency_Online_Video_Stabilization
- [MoCoGAN]_Decomposing_Motion_and_Content_for_Video_Generation
- [SeqGAN]_Sequence_Generative_Adversarial_Nets_with_Policy_Gradient
- [SteadyFlow]_Spatially_Smooth_Optical_Flow_for_Video_Stabilization
- 3Dtracking
- ACT
- Atom
- Cascaded_SiamRPN
- DaSiamRPN
- Deeper_and_Wider_SiamRPN
- Deep_Reinforcement_Learning_for_Visual_Object_Tracking_in_Videos
- GCT
- KCF
- Large_Scale_Object_Mining_for_Object_Discovery_from_Unlabeled_Video
- Learning_Correspondence_from_the_Cycle-Consistency_of_Time
- Learning_Discriminative_Model_Prediction_for_Tracking
- Learning_Dynamic_Memory_Networks_for_Object_Tracking
- Learning_to_Track_Online_Multi-Object_Tracking_by_Decision_Making
- MDNet
- Multi-person_Articulated_Tracking_with_Spatial_and_Temporal_Embeddings
- Prediction_Tracking_Segmentation
- Re3_Real-Time_Recurrent_Regression_Networks_for_Visual_Tracking_of_Generic_Objects
- SiamBM
- SiameseFC
- SiamMask
- SiamMask_E
- SiamRPN++
- SiamRPN
- Spatial_Temporal_Relation_Networks_for_Multi_Object_Tracking
- [ADN]_Action-Decision_Networks_for_Visual_Tracking_with_Deep_Reinforcement_Learning
- [RASNet]
- [ROLO]_Spatially_Supervised_Recurrent_Convolutional_Neural_Networks_for_Visual_Object_Tracking
- End-to-End_Detection_and_Re-identification_Integrated_Net_for_Person_Search
- Looking_Fast_and_Slow_Memory-Guided_Mobile_Video_Object_Detection
- Object_Detection_in_Videos_by_Short_and_Long_Range_Object_Linking
- On_The_Stability_of_Video_Detection_and_Tracking
- Recurrent_Neural_Network_Regularization
- Seq-NMS_for_Video_Object_Detection
- Towards_High_Performance_Video_Object_Detection_for_Mobiles
- [A-LSTM]_Online_Video_Object_Detection_using_Association_LSTM
- [AOD]_Attentional_Network_for_Visual_Object_Detection
- [ATW]_Attention-based_Temporal_Weighted_Convolutional_Neural_Network_for_Action_Recognition
- [Bottleneck-LSTM]_Mobile_Video_Object_Detection_with_Temporally-Aware_Feature_Maps
- [ClosedLoop]_Spatio-Temporal_Closed-Loop_Object_Detection
- [D&T]_Detect_to_Track_and_Track_to_Detect
- [DuATM]_Dual_Attention_Matching_Network_for_Context-Aware_Feature_Sequence_based_Person_Re-Identification
- [MGN]_Learning_Discriminative_Features_with_Multiple_Granularities_for_Person_ReID
- [RAM]_Recurrent_Models_of_Visual_Attention
- [Re-id]_An_Improved_Deep_Learning_Architecture_for_Person_Re-Identification
- [ResAtt]_Residual_Attention_Network_for_Image_Classification
- [SoftAtt]_Recurrent_Soft_Attention_Model_for_Common_Object_Recognition
- [STMN]_Spatial-Temporal_Memory_Networks_for_Video_Object_Detection
- [STSN]_Object_Detection_in_Video_with_Spatiotemporal_Sampling_Networks
- [T-CNN]_Tubelets_with_Convolutional_Neural_Networks_for_Object_Detection_from_Videos
- [TCN]_Object_Detection_from_Video_Tubelets_with_Convolutional_Neural_Networks
- [TPN]_Object_Detection_in_Videos_with_Tubelet_Proposal_Networks