
    DuFNet: Dual Flow Network of Real-Time Semantic Segmentation for Unmanned Driving Application of Internet of Things

    2023-02-17

    Tao Duan, Yue Liu, Jingze Li, Zhichao Lian and Qianmu Li

    1School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China

    2School of Cyberspace Security, Nanjing University of Science and Technology, Wuxi, 320200, China

    ABSTRACT The application of unmanned driving in the Internet of Things is one of the concrete manifestations of applying artificial intelligence technology. Image semantic segmentation can help the unmanned driving system by achieving road accessibility analysis. Semantic segmentation is also a challenging technology for image understanding and scene parsing. In this paper, we focused on the challenging task of real-time semantic segmentation and proposed a novel fast architecture named DuFNet. Starting from the existing work of the Bilateral Segmentation Network (BiSeNet), DuFNet proposes a novel Semantic Information Flow (SIF) structure for context information and a novel Fringe Information Flow (FIF) structure for spatial information. We also proposed two kinds of SIF, with cascaded and parallel structures, respectively. The SIF encodes the input stage by stage in the ResNet18 backbone and provides context information for the feature fusion module. Features from earlier stages usually contain rich low-level details, while later stages carry high-level semantics. The multiple convolutions embedded in the Parallel SIF aggregate the corresponding features among different stages and generate a powerful global context representation with less computational cost. The FIF consists of a pooling layer and an upsampling operator followed by a projection convolution layer. This concise component provides more spatial details for the network. Compared with BiSeNet, our work achieved faster speed and comparable performance, with 72.34% mIoU accuracy and 78 FPS on the Cityscapes dataset based on the ResNet18 backbone.

    KEYWORDS Real-time semantic segmentation; convolutional neural network; feature fusion; unmanned driving; fringe information flow

    1 Introduction

    The Internet of Things (IoT) has brought unprecedented convenience to people’s lives and can realize real-time tracking of the location and status of “things” [1,2]. The application of artificial intelligence technology in the Internet of Things has achieved many results, including smart transportation, smart home [3,4] and smart wear [5-7]. Unmanned driving is one of the hottest fields in intelligent transportation applications, and great progress has been achieved by utilizing computer vision algorithms.

    As an important topic in computer vision, image semantic segmentation aims to assign each pixel of an image a pre-defined class label [8]. This can help the unmanned driving system recognize different entities near the current road and then guide corresponding driving choices to avoid traffic jams or traffic accidents [9,10]. In addition, in order to reduce latency, existing studies often combine IoT with edge computing [11]. Edge computing has emerged as a promising solution to address the limitations of cloud computing in supporting delay-sensitive and context-aware services in the IoT era [12-14]. This allows the unmanned driving system to focus on its surroundings while moving at high speed. Apart from the autonomous driving [15,16] mentioned above, semantic segmentation also has wide-ranging applications such as scene parsing [17] and medical segmentation [18]. Many previous approaches focus on high-quality semantic segmentation but do not pay enough attention to real-time image segmentation [19,20]. Especially in the unmanned driving scene [21], the development of urban roads and the increase in car speed place higher demands on the response time of real-time segmentation algorithms. Thus, we are committed to improving the speed of the model without sacrificing segmentation accuracy.

    Towards more accurate prediction, many approaches rely on novel architectures and strong modules. Certain well-known backbones have achieved impressive results as strong prior feature representations, e.g., AlexNet [22], ImageNet [23,24], VGGNet [25] and ResNet [26]. ResNet is widely known for its Residual Block and identity skip connections. The goal of the module is clear: the current layer can learn features that differ from the information already encoded by previous layers.

    In fact, the idea behind this kind of method is that good performance depends on the fusion of semantic features and spatial features (or boundary details) [27-29]. In the residual block, the deeper layers extract semantic information through larger receptive fields and the lower layers retain the spatial details. In BiSeNet [30], the Feature Fusion Module fuses features from the Context Path and the Spatial Path, which provide encoded context information and spatial information, respectively. To realize this theory, many methods employ a two- or multi-branch architecture. They build strong semantic representations based on a state-of-the-art backbone with a deeper structure and generate a spatial prediction map rich in detailed information with a lightweight network.

    The Boundary-Aware Network (BANet) [31] proposed a two-stream framework, not completely separated, to achieve semantic segmentation based on BiSeNet’s feature fusion module. The Dual Stream Segmentation Network (DSSNet) [32] introduced its attention module and pyramid pooling module based on BiSeNet. Inspired by BiSeNet, TB-Net [33] proposed a three-stream boundary-aware network which replaces the context path with a context-aware attention module and adds a Boundary Stream to enhance segmentation performance, particularly for thin and small objects.

    In addition, a few approaches try to capture multi-scale context information at the same stage instead of at different levels. They proposed pyramid pooling modules with atrous convolution [34,35] or pooling layers to capture different fine-grained information. The Pyramid Pooling Module (PPM) proposed by the Pyramid Scene Parsing Network (PSPNet) [36] attempted to merge coarse features and fine features for better inference.

    For the real-time semantic segmentation task, how to predict quickly without sacrificing too much accuracy is the main research direction [37-39]. As previously stated, many works focus on pursuing better accuracy with complicated structures and heavyweight components that do not perform well in real-time segmentation tasks. Therefore, many methods use speedup strategies such as downsampling the input size, shrinking network channels, and model compression [40]. On the one hand, these strategies exploit the most critical factors, such as image resolution, number of features, and network parameters, to build the models. On the other hand, their drawback is clear: they cannot maintain efficiency on high-resolution images without sacrificing accuracy.

    BiSeNet proposed the fusion of semantic features and spatial features with two custom paths and many powerful modules, pursuing better performance under the two evaluation metrics of accuracy and speed. We believe the theory of a two-branch architecture based on semantic and spatial information benefits real-time segmentation on high-quality prediction tasks. However, some structures can still be cropped for faster speed without losing too much accuracy. Based on Semantic Information Flow and Fringe Information Flow, we constructed a novel end-to-end trainable network called Dual Flow Network (DuFNet). In addition, we also introduced our cascaded Semantic Information Flow (SIF) with a Global Semantic Module to aggregate semantic and spatial information. Therefore, through the reasonable combination of multiple components, we designed a two-stream segmentation framework that balances accuracy and speed, and the proposed method is thus more suitable for the unmanned driving application of the Internet of Things because of its lower computational burden.

    Our main contributions are listed as follows:

    1. We proposed a powerful architecture, DuFNet, for the real-time semantic segmentation task. We introduced a classical two-branch model with a unique Semantic Information Flow and a compact Fringe Information Flow to capture information at different receptive fields. The feature fusion module borrowed from BiSeNet aggregates these features to generate excellent performance.

    2. We also designed the parallel SIF and the cascaded SIF with different components. The parallel structure makes the network suitable for inference-efficient implementation with slight accuracy loss. The cascaded structure is beneficial for speed.

    3. We validated the performance of our DuFNet on the Cityscapes dataset with the ResNet18 backbone for real-time semantic segmentation. Our method achieves 72.34% mIoU at a speed of 78 FPS on high-resolution images, outperforming existing state-of-the-art methods.

    2 Related Work

    Real-time semantic segmentation has been studied for many years and many methods based on deep learning have achieved state-of-the-art performance. There are mainly two patterns for real-time semantic segmentation with high-resolution semantic map prediction. The critical difference between them is whether the network is a single-pipeline structure or a multi-branch structure.

    One of these solutions is the Encoder-Decoder structure [41], a single-pipeline model that has been successfully applied to the semantic segmentation task. The encoder part gradually extracts contextual information by layer-by-layer convolution and generates a high-dimensional feature representation. The decoder part gradually recovers the spatial information. SegNet [42] is a clear example of this structure: its encoder stage is composed of a set of pooling and convolution layers, and its decoder of a set of upsampling and convolution layers. Deeplabv3 [43], an encoder variant, employs a spatial pyramid pooling module with atrous convolution to encode rich semantic information. Multi-scale feature ensembles [44,45] achieved impressive results in terms of accuracy but without a speed advantage.

    Another solution to semantic segmentation is the two- or multi-branch architecture. These methods construct different networks adapted to different tasks on multiple branches and combine the sub-results generated by each branch to further improve performance at an acceptable cost [40,46-49]. This kind of strategy overcomes the weakness that a single-pipeline structure cannot take full advantage of the information in the original image. For real-time semantic segmentation, ICNet [40] proposed a novel three-branch model based on an in-depth analysis of the time budget in common frameworks and extensive experiments. Its strategy of feeding low-resolution images into the full CNN and feeding medium- and high-resolution images into light-weight CNNs is very clever. This architecture not only efficiently utilizes semantic information from low-resolution images along with details from high-resolution images but also achieves faster prediction even on high-resolution images. ICNet avoids the insufficiency of intuitive speedup strategies, including downsampling the input size, shrinking the channels of the network and model compression.

    The two-branch architecture is dedicated to alleviating a problem of the encoder-decoder architecture: partial information is lost during the downsampling process. GCN [47] proposed a multi-resolution architecture that jointly exploits fine and coarse features, and its two branches with partially shared weights achieve better speed performance. The Context Aggregation Network [49] proposed a dual-branch convolutional neural network with significantly lower computational cost while maintaining competitive accuracy.

    RGPNet [38] proposed encoder-decoder structures for the four-level outputs from the backbone, respectively. The decoder reconstructs the lost spatial information from multiple stages with feature fusion. Inspired by RGPNet, we also propose a parallel structure for context information which extracts semantic representations from the four stages of the backbone independently.

    BiSeNet is also an excellent two-branch approach. It constructs a Semantic Path based on the ResNet18 or Xception backbone with an Attention Refinement Module to obtain a large receptive field. While the Semantic Path encodes rich semantic contextual information, the Spatial Path utilizes a lightweight model to provide spatial information. BiSeNet uses a Feature Fusion Module (FFM) to fuse the two paths, which differ in their level of feature representation. We use the same FFM as BiSeNet in our network. However, the weakness of BiSeNet is obvious: the light-weight model of the Spatial Path still incurs a certain computation time budget in order to bring the features to a similar level. We try to take full advantage of the features generated inside the backbone and use simple operators to form the so-called Spatial Path for efficiency. Moreover, the U-shape cascaded structure used in the Context Path does not fully exploit the potential of the network model (such as the ResNet18 backbone). We also conduct experiments with our cascaded model for extracting context information to demonstrate our speedup strategies.

    3 Approach

    3.1 Structure of DuFNet

    Inspired by the Bilateral Segmentation Architecture of BiSeNet, we propose our DuFNet with two separate paths: Semantic Information Flow (SIF) and Fringe Information Flow (FIF). It comprises two components: an encoder based on ResNet18 which takes full advantage of multi-scale contexts in parallel, and a tiny lightweight network that reconstructs lost detailed information. The encoder extracts high-level features and generates semantic information with different levels of abstraction at different stages in the SIF. The lightweight network shares features from the encoder and estimates low-level spatial information in the FIF.

    As shown in Fig. 1, we illustrate our proposed DuFNet, whose modules and layers are detailed later. Furthermore, we elaborate on how the semantic information and fringe information contained in the features are calculated and transmitted in each module and layer. It is clear that DuFNet keeps only BiSeNet’s bilateral architecture and Feature Fusion Module but designs totally different patterns for aggregating multi-scale contexts in our bilateral structure.

    Figure 1: Overview of our proposed DuFNet. We first use a CNN (ResNet18) to extract features of the input image stage by stage, then use multiple convolution layers to transform different fine-grained features in parallel. In this process, we aggregate information with different levels of abstraction and the concatenation layer merges all features. Secondly, the pooling layer and convolution layer with upsampling reconstruct the spatial information flow from different stages of the CNN. Finally, the two flows are fed into the feature fusion module to get the final prediction

    3.2 Semantic Information Flow

    The Semantic Information Flow, the encoder, relies on a modified ResNet18 as the backbone. It is divided into four stages, named with numbers according to whether the resolution of the features has changed. In the diagram in Fig. 2, features of the four stages (stage1, stage2, stage3 and stage4) of the ResNet18 backbone are extracted at spatial resolutions of 1/4, 1/8, 1/16 and 1/32 and with 64, 128, 256 and 512 channels, respectively.

    Considering multi-scale context aggregation and computation demand simultaneously, we choose ResNet18 as an effective global prior representation for producing better performance, not just gaining accuracy. The residual blocks, which address vanishing/exploding gradients, also allow the network to easily enjoy the context information flowing through the model.

    Figure 2: The schema of our Parallel Semantic Information Flow. Different from the original BiSeNet, we use a paralleled structure and multiple convolution layers

    In order to use the context information generated at different stages, we propose a parallel convolution structure containing four convolutions with different receptive fields. For the feature maps of stage1, we use a dilated convolution layer with kernel=5, dilation=2 and stride=4, followed by batch normalization and ReLU, to refine high-level and low-level information. Dilated convolution, a powerful feature extraction tool, can explicitly adjust the field-of-view without extra parameters or computation burden. The feature maps extracted by stage1 contain more spatial information and less semantic information due to the small receptive field. For stage2, we just use a normal convolution layer with kernel=3 and stride=2 to resize the features. For the features of stage3 and stage4, we use projection convolutions with kernel=1 to project them so that they have the same number of channels, and we additionally apply an upsampling operation to the output of stage4.

    Given the features of the different stages, we transform them into feature maps with the same number of channels (e.g., 128) and the same spatial resolution, i.e., 1/8 of the original input. After that, we concatenate the feature maps of all stages to aggregate different fine-grained context information. The concatenated features are fed into the Feature Fusion Module to provide sufficient semantic information.
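    The four per-stage transforms described above can be sketched in PyTorch roughly as follows. This is not the authors' exact implementation: the padding values, the bilinear upsampling mode and the 128-channel width are our assumptions, and the common output grid simply follows from the stated kernels and strides.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ParallelSIF(nn.Module):
    """Sketch of the Parallel Semantic Information Flow.

    Inputs are the four ResNet18 stage outputs with 64/128/256/512
    channels; each branch projects its stage to a common channel width
    and a common spatial grid before concatenation."""

    def __init__(self, out_ch=128):
        super().__init__()
        # stage1: dilated conv (kernel=5, dilation=2, stride=4) + BN + ReLU
        self.s1 = nn.Sequential(
            nn.Conv2d(64, out_ch, kernel_size=5, dilation=2, stride=4, padding=4),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
        # stage2: plain conv, kernel=3, stride=2
        self.s2 = nn.Conv2d(128, out_ch, kernel_size=3, stride=2, padding=1)
        # stage3 / stage4: kernel=1 projection convolutions
        self.s3 = nn.Conv2d(256, out_ch, kernel_size=1)
        self.s4 = nn.Conv2d(512, out_ch, kernel_size=1)

    def forward(self, f1, f2, f3, f4):
        y1 = self.s1(f1)
        y2 = self.s2(f2)
        y3 = self.s3(f3)
        # stage4 is projected, then upsampled onto the common grid
        y4 = F.interpolate(self.s4(f4), size=y3.shape[2:],
                           mode='bilinear', align_corners=False)
        # concatenate all fine-grained context features
        return torch.cat([y1, y2, y3, y4], dim=1)
```

    With stage resolutions of 1/4, 1/8, 1/16 and 1/32, all four branch outputs land on one grid with 4 × 128 = 512 concatenated channels, ready for the Feature Fusion Module.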

    3.3 Fringe Information Flow

    The goal of the semantic segmentation task is to split images into several regions or to confirm the boundaries of objects. Many methods encode rich contextual information into coarse prediction maps while missing object boundary details. However, extra spatial information or boundary details are crucial when the model restores resolution from the prediction maps in the decoding stage.

    One of the most interesting approaches, BiSeNet, uses a lightweight network with three layers to preserve spatial information. There is no doubt that this idea produces better accuracy without high computational overhead. However, it could be faster by using an operator with fewer parameters instead of multiple convolution layers.

    Based on experiments, we propose the Fringe Information Flow architecture to preserve boundary details and encode rich spatial information. Fig. 3 shows the details of this design. Different from BiSeNet’s Spatial Path with three convolution layers, we use one convolution layer and one optional pooling layer to recover the spatial information missed in the process of downsampling. The pooling layer focuses on the rich object boundary details in the features from stage1 of the backbone. The convolution layer regulates the channels of the features after scaling up their spatial size. The feature maps of stage3 are usually rich in contextual information without losing many spatial details, so some details can be restored from these feature maps through upsampling operations. We propose upsampling the feature maps of stage3 to produce a powerful representation whose size is only 1/8 of the original image. After that, we use a projection convolution with kernel=1 just to squeeze the channels of the feature maps to fit the input requirements of the Feature Fusion Module. The pooling layer works on the feature maps of stage1, and the spatial size of its output feature maps is also 1/8 of the original image. The purpose of pooling the stage1 features is to supplement the boundary details lost in the deeper stages.
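    A minimal PyTorch sketch of this FIF variant (the form that uses both stage1 and stage3) might look as follows. The element-wise fusion by addition and the extra 1×1 projection on the pooled stage1 features (needed here to match channel widths) are our assumptions; the paper only specifies one pooling layer, one upsampling step and a kernel=1 projection convolution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FringeInformationFlow(nn.Module):
    """Sketch of FIF: pooled stage1 details plus upsampled, projected
    stage3 features, both brought to 1/8 of the input resolution."""

    def __init__(self, out_ch=128):
        super().__init__()
        # squeeze stage3 channels (256) after upsampling
        self.proj3 = nn.Conv2d(256, out_ch, kernel_size=1)
        # bring pooled stage1 features (64 ch) to the same width (assumption)
        self.proj1 = nn.Conv2d(64, out_ch, kernel_size=1)

    def forward(self, f1, f3):
        # stage1 (1/4 scale) -> 1/8 scale via pooling: boundary details
        d1 = self.proj1(F.avg_pool2d(f1, kernel_size=2, stride=2))
        # stage3 (1/16 scale) -> 1/8 scale via upsampling, then project
        d3 = self.proj3(F.interpolate(f3, scale_factor=2,
                                      mode='bilinear', align_corners=False))
        # fuse by pixel-wise summation (assumed fusion)
        return d1 + d3
```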

    Figure 3: The schema of BiSeNet (a) and our Fringe Information Flow (b)

    4 Experiments

    We evaluate our proposed model on the publicly available Cityscapes dataset. In addition, we compare its performance with state-of-the-art works and perform an ablation analysis of our DuFNet against BiSeNet under multiple evaluation metrics. Experimental results are as follows.

    4.1 Datasets and Implementation Details

    We introduce the public semantic segmentation dataset and the details of our experimental setup.

    Table 1: Speed comparison of our proposed DuFNet against other state-of-the-art methods. “-” indicates that the method did not report the corresponding speed result

    4.1.1 Cityscapes

    Cityscapes [50] is a large-scale urban street scene understanding dataset for training and testing approaches for pixel-level, instance-level and panoptic semantic labeling tasks. It contains 2975 training and 500 validation images with high-quality pixel-level annotations and involves 30 classes (e.g., road, person, car, wall and sky).

    4.1.2 Implementation Details

    Our architecture starts from the ResNet18 backbone and is implemented in PyTorch [51]. We use the poly strategy, in which the learning rate varies according to base_lr × (1 − iter/max_iter)^power, with base_lr = 1e-2 and power = 0.9 during training. The weight decay and momentum are set to 5e-4 and 0.9, respectively. We use the cross-entropy loss function. For the Cityscapes dataset, we train our model with the SGD optimizer on all 19 categories.
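    The poly schedule above can be written directly; `max_iter` is the total number of training iterations:

```python
def poly_lr(base_lr, cur_iter, max_iter, power=0.9):
    """Poly learning-rate policy: lr = base_lr * (1 - iter/max_iter)**power."""
    return base_lr * (1.0 - cur_iter / max_iter) ** power
```

    The rate starts at base_lr, decays smoothly, and reaches zero at the final iteration.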

    For data augmentation, the image is randomly scaled between 0.75 and 2.0 and randomly rotated between −10 and 10 degrees. Because the Cityscapes dataset consists of high-quality images with a resolution of 1024×2048, we crop the images to 768×1536 for the training and validation processes.

    The Cityscapes dataset is divided into 2975/500/1525 images for training, validation and testing, respectively, with 19 classes. Training and validation are run on a server with a single RTX 2080Ti. Limited by the GPU card, the batch size is set to 8 to avoid Out of Memory (OOM) errors, and strategies that would make the experiments more efficient, including multithreaded training and Synchronized Multi-GPU Batch Normalization (SyncBN), are disabled.

    4.1.3 Evaluation Metrics

    For evaluation, we use mean intersection over union (mIoU), mIoU without background (mIoU_no_back) and all-pixel accuracy (allAcc) for accuracy comparison. In addition, we also care about speed and use frames per second (FPS) for speed comparison [52].

    mIoU = (1/k) × Σ_i [ P_ii / (Σ_j P_ij + Σ_j P_ji − P_ii) ]

    where k represents the total number of predicted categories, P_ii denotes the number of pixels predicted as class i whose true class is also i, namely true positives, P_ij represents the number of pixels predicted as class i but actually of class j, namely false positives, and P_ji represents the number of pixels predicted as class j but actually of class i, namely false negatives.
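    The metric can be computed from a k×k confusion matrix as a short sketch; representing the matrix as nested lists with conf[i][j] = P_ij is our own convention:

```python
def mean_iou(conf):
    """mIoU from a k x k confusion matrix, where conf[i][j] is the number
    of pixels predicted as class i whose true class is j (P_ij)."""
    k = len(conf)
    ious = []
    for i in range(k):
        tp = conf[i][i]                        # P_ii: true positives
        fp = sum(conf[i]) - tp                 # sum_j P_ij minus P_ii
        fn = sum(row[i] for row in conf) - tp  # sum_j P_ji minus P_ii
        denom = tp + fp + fn
        ious.append(tp / denom if denom else 0.0)
    return sum(ious) / k
```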

    4.2 Experiments

    We compare our method with other state-of-the-art methods on the Cityscapes dataset. We evaluate accuracy and speed on an NVIDIA GTX 1080Ti card with high-resolution input (except DFANet A). All approaches are real-time semantic segmentation models except FCN-8s and PSPNet, which focus on high-quality image segmentation. Results are shown in Table 1 [53].

    We choose the best version, RN-S*P, as the network for the efficiency comparison with BiSeNet. To be fair, we use a common training strategy and report the average speed over 5000 runs on a single NVIDIA RTX 2080Ti card. All models are based on the ResNet18 backbone with image resolutions of {360×640, 512×1024, 720×1280, 1080×1920}.
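    An averaged speed measurement of this kind can be sketched as follows; wrapping one inference step in a zero-argument callable and excluding a few warm-up iterations are our own choices, not details from the paper:

```python
import time

def measure_fps(model_step, n_warmup=10, n_runs=100):
    """Average frames per second over repeated single-image inference.

    `model_step` is any zero-argument callable that runs one forward
    pass; warm-up iterations are excluded from the timing."""
    for _ in range(n_warmup):
        model_step()
    t0 = time.perf_counter()
    for _ in range(n_runs):
        model_step()
    elapsed = time.perf_counter() - t0
    return n_runs / elapsed
```

    On a GPU, one would additionally synchronize the device before reading the clock so that queued kernels are included in the measured time.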

    As shown in Table 2, our model reaches better performance than BiSeNet at every image resolution in the speed test. DuFNet reaches a much faster inference speed of 117.2 FPS with an image resolution of 720×1280 as input. Compared with BiSeNet, our method yields 61.1 FPS when running on high-resolution 1080×1920 images, a speed improvement of about 43.8%. It can be seen that the structure of DuFNet improves speed without losing accuracy, making the network more suitable for inference-efficient implementation and more streamlined to operate. As shown in Fig. 4, both methods achieve segmentation results that differ little from the ground truth, but our method pays more attention to small targets, such as a car’s logo. In addition, our method is more robust than BiSeNet, with no incomplete regions.

    Table 2: Speed comparison of our proposed DuFNet with different resolutions

    4.3 Ablation Analysis

    In this section, we evaluate the performance of each component independently and compare the overall efficiency of our model in different configurations with BiSeNet and other well-known real-time semantic segmentation models. We conduct the ablation analysis on the Cityscapes dataset and train all models with the same strategy for fairness.

    Figure 4: Results of BiSeNet and our DuFNet on the Cityscapes validation dataset. (a) Image. (b) Ground Truth. (c) BiSeNet. (d) Ours

    4.3.1 Ablation for FIF

    Light-weight networks have been shown in well-known works to be beneficial in providing rich spatial information at a low computation cost. In pursuit of faster performance, we propose a novel model with less computation and conduct experiments while preserving the Context Path of BiSeNet for comparison [54].

    One form of FIF combines features from stage1 of the backbone after the pooling layer with features from stage3 after upsampling, as illustrated in Fig. 3b; we call it RN-S*. As shown in Table 3, our FIF achieves 72.79% mIoU and 95.27% allAcc with the same settings. The accuracy of FIF drops by about 0.2% on the accuracy metrics, but it reaches 178.3 FPS, an absolute improvement of 27.7 FPS over BiSeNet (150.6 FPS).

    Table 3: Detailed performance comparison of Fringe Information Flow in our proposed DuFNet

    The other form of FIF only uses features from stage3 of the backbone with upsampling and projection convolution; we call it RN-S. As shown in Table 3, BiSeNet achieves 73.01% mIoU, 95.42% allAcc and 150.6 FPS. Our light FIF exceeds BiSeNet by 32.8 FPS, reaching 183.4 FPS, while its mIoU reaches 72.88%, very close to BiSeNet with only a 0.13% drop. Compared with RN-S*, it yields faster inference because it avoids the computation of the pooling layer.

    The reason for reusing features of the backbone is twofold. First, the model does not need a separate lightweight network to extract features for spatial information, which avoids a certain time budget. Second, stage3 of the backbone can recover a certain amount of context and spatial information at the same time, while stage1 supplies rich object details.

    4.3.2 Ablation for SIF

    Convolution is a very flexible operator that can easily be deployed wherever feature maps need to be spatially decreased or increased. Building on this idea, we propose our Parallel Semantic Information Flow with several types of convolution, including atrous convolution, to yield a stronger semantic representation. We concatenate the feature maps from each stage of ResNet18 after the corresponding convolution layers to generate a global context representation. Details are shown in Table 4.

    Table 4: Detailed performance comparison of Parallel Semantic Information Flow in our proposed DuFNet

    In order to evaluate our parallel SIF independently, we migrate it to BiSeNet and keep the original Spatial Path; we call this RN-P. Table 4 shows that our SIF yields faster inference, at the cost of some prediction accuracy, and scores 164.9 FPS. In addition, we also unite our FIF with the parallel SIF to investigate the performance, called RN-SP and RN-S*P. As shown in Table 4, their accuracies are close. RN-SP achieves 71.36/95.00 in terms of mIoU and allAcc (%) and 189.6 FPS, outperforming BiSeNet by 39 FPS. Most importantly, although the mIoU of RN-S*P drops to 72.34%, its speed is still good enough to outperform BiSeNet with an absolute improvement of 45.1 FPS. Different from the parallel SIF above, we also propose a SIF with a cascaded structure. Fig. 5a shows the details of this design. Here we deploy a U-shape structure to encode the features of different stages.

    Global Semantic Module: Inspired by the context module of BiSeNet, we also utilize projection convolution and average pooling to construct our module for capturing more global context information [55]. As Fig. 5b shows, a global average pooling abstracts global context information by adopting pooling output sizes of 1×1, 2×2, 3×3 and 6×6, respectively. After that, we recover the spatial size of the features by upsampling, so that they can be added to the input features by pixel-wise summation.

    Global average pooling is a useful model of the global contextual prior and is commonly applied in semantic segmentation tasks [36,56]. Our strategy of putting pooling before convolution reduces the number of parameters and the computational cost of our model. More importantly, it is expected to extract more useful context information and produce a powerful global prior representation.

    Inspired by the multi-scale parallel spatial pooling structure of PSPNet, we apply multiple cascaded pooling layers with different downsampling sizes at the sub-decoding stages. It is a classical U-shape structure with shortcut layers. For deeper feature maps we use a GSM module with a smaller pooling size, and for lower feature maps a GSM module with a larger pooling size. For example, the pooling layer inside the GSM module marked with 1 pools the features of stage4 into a single-bin feature map, and the pooling layers inside the other GSMs generate outputs of different scales, as marked with different colors.
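    The pool-upsample-sum path of a GSM can be sketched numerically as follows. This is a single-channel illustration only: nearest-neighbour upsampling, feature sizes divisible by the bin count, and the omission of the module's projection convolution are all simplifying assumptions.

```python
import numpy as np

def gsm(feature, bins):
    """Sketch of the GSM pooling path: average-pool an (H, W) feature map
    into bins x bins cells, upsample back to (H, W) by nearest-neighbour
    repetition, and add the result to the input (pixel-wise summation)."""
    h, w = feature.shape
    bh, bw = h // bins, w // bins
    # average within each of the bins x bins cells
    pooled = feature.reshape(bins, bh, bins, bw).mean(axis=(1, 3))
    # nearest-neighbour upsampling back to the original size
    upsampled = np.repeat(np.repeat(pooled, bh, axis=0), bw, axis=1)
    return feature + upsampled
```

    With bins=1 this reduces to adding the global mean to every pixel, i.e., a pure global contextual prior; larger bin counts keep progressively more spatial layout in the prior.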

    Figure 5: The schema of our Cascade Semantic Information Flow (a). Different from the original BiSeNet, we use a totally different structure and our custom Global Semantic Module (GSM). The GSM module contains only one global pooling layer, which generates bin outputs of different sizes {1, 2, 3, 6}, as illustrated in part (b)

    Auxiliary Branch: The auxiliary branches are added after the fusion of features from adjacent stages and help to optimize the learning process at different levels. We still use BiSeNet’s segmentation heads, with tiny adjustments, to process the auxiliary branch outputs, and we add weights to balance the auxiliary losses for accuracy. During the testing stage, we abandon all the auxiliary branches and use only the master branch for the final prediction.

    In the experiments, we also introduce the Cascade Semantic Information Flow (CSIF) to investigate the performance of the U-shape-like structure, where we use the lightweight ResNet18 as the backbone and keep the Spatial Path of BiSeNet for comparison. In addition, we use the Global Semantic Module with four different factors (denoted G1, G2, G3 and G6) to combine the feature outputs of each stage of the ResNet18 network; this variant is called RN-C1236. The numbers represent the pooling size of the pooling layer inside the GSM module. To further evaluate the component, we also conduct experiments with several settings, including removing G3 and replacing G6 with G3 or G4, called RN-C126, RN-C123 and RN-C124, respectively. As listed in Table 5, RN-C1236 performs worse than BiSeNet on all metrics: too many GSMs increase the computational burden and reduce efficiency, and the parameters of the components are not set reasonably. However, RN-C124 yields a good result of 161.49 FPS with only a small accuracy loss; it achieves 72.39%/95.05% in terms of mIoU and allAcc, about 0.6% lower than BiSeNet.

    Table 5: Detailed performance comparison of Cascade Semantic Information Flow in our proposed DuFNet

    Based on the FIF described above, we also conducted experiments to verify the improvement of the unit. As listed in Table 6, RN-S*C1236 yields 72.12/95.04 in terms of mIoU and allAcc and 168.0 FPS. RN-S*C124 performs better, with 72.70/175.2 in terms of mIoU and FPS.

    Table 6: Detailed performance comparison of Cascade SIF and FIF in our proposed DuFNet

    The CSIF, which is inspired by the Pyramid Pooling Module of PSPNet [57], demonstrates its efficiency in speed but also the drawbacks of a cascaded structure. Therefore, we propose a better Parallel Semantic Information Flow.
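    The difference in dependency structure between the two flows can be sketched with scalars standing in for feature maps (a deliberately simplified illustration of ours; the real modules are convolution and pooling stacks). In the cascade each module must wait for the output of the previous one, while the parallel variant lets every module process its own stage feature independently before aggregation:

```python
def cascade_flow(feats, gsms):
    """Cascade SIF: each module consumes the running context plus the
    current stage feature, so the modules run strictly in sequence."""
    ctx = 0.0
    for f, g in zip(feats, gsms):
        ctx = g(ctx + f)
    return ctx

def parallel_flow(feats, gsms):
    """Parallel SIF: every module processes its own stage feature
    independently; outputs are aggregated afterwards, so the modules
    could run concurrently with no inter-module dependency."""
    return sum(g(f) for f, g in zip(feats, gsms))

# Toy module: a stand-in for a GSM that halves its input.
halve = lambda x: 0.5 * x
stage_feats = [1.0, 2.0, 3.0, 4.0]
cascaded = cascade_flow(stage_feats, [halve] * 4)
paralleled = parallel_flow(stage_feats, [halve] * 4)
```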

    4.4 Results

    Our proposed architecture, RN-S*P, achieves comparable performance on multiple evaluation metrics. We evaluate the segmentation results on a server with a single RTX 2080Ti.

    4.4.1 DuFNets vs. BiSeNet

    Compared to the well-known real-time semantic segmentation model BiSeNet, DuFNet achieves absolute inference speed improvements with only a slight sacrifice of accuracy. As shown in Fig. 5, the accuracy of DuFNet is close to that of BiSeNet, both in the object classification task and in the accurate prediction of boundary details. The quantitative results are summarized in Table 4.

    In terms of inference speed, our DuFNets achieve a significant increase while retaining high-quality image segmentation results. Exemplary results are shown in Fig. 6.

    Figure 6: Accuracy and inference speed of BiSeNet and our DuFNets

    5 Conclusion

    We have proposed a novel architecture called DuFNet for real-time semantic segmentation tasks. The Fringe Information Flow takes advantage of the features of the backbone and reconstructs the spatial information with a lightweight structure. The Cascade Semantic Information Flow enhances the quality of context encodings throughout its feature hierarchy with custom modules. The Parallel Semantic Information Flow gives the network better representational power by fusing spatial and context features more efficiently and contributes to the prior global representation generated by the feature fusion module. A wide range of experiments shows the effectiveness of DuFNet, which achieves comparable performance on the Cityscapes dataset. In the real world, autonomous driving and other applications still place high demands on the speed and accuracy of real-time semantic segmentation. A favorable trade-off between segmentation accuracy and inference speed will foster further research in this field.

    Funding Statement: This work is supported in part by the National Key R&D Program of China (2021YFF0602104-2, 2020YFB1804604), in part by the 2020 Industrial Internet Innovation and Development Project from the Ministry of Industry and Information Technology of China, and in part by the Fundamental Research Fund for the Central Universities (30918012204, 30920041112).

    Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
