By xtLytics LLC
- Introduction
- BI-RADS ©
- Assessment Categories of BI-RADS ©
- Breast Composition Categories
- Devices
- Workflow
- Discussions
- Documentation for individual models
- Evidence
- News
- References
- Credits
CADt-net is a neural network created to identify regions of cancerous lesions (ROI: region of interest) and to classify breast ultrasound images into their respective BI-RADS© scores. CADt-net is used to develop a proprietary mobile app.
BI-RADS© [1] is an acronym for Breast Imaging-Reporting and Data System, a quality assurance tool originally designed for use with mammography. The system is a collaborative effort of many health groups but is published and trademarked by the American College of Radiology [2] (ACR).
While BI-RADS is a quality control system, in day-to-day usage the term "BI-RADS" refers to the mammography assessment categories. These are standardized numerical codes typically assigned by a radiologist after interpreting a mammogram. This allows for concise and unambiguous understanding of patient records between multiple doctors and medical facilities.
The assessment categories were developed for mammography and later adapted for use with MRI and Ultrasound findings. The summary of each category, given below, is nearly identical for all 3 modalities.
Category 6 was added in the 4th edition of the BI-RADS.
BI-RADS Assessment Categories are:
BI-RADS© Score | Inference |
---|---|
0 | Incomplete |
1 | Negative |
2 | Benign |
3 | Probably benign |
4 | Suspicious |
5 | Highly suggestive of malignancy |
6 | Known biopsy-proven malignancy |
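For reference in code, the assessment categories map naturally onto a lookup table. A minimal Python sketch (the names here are illustrative, not part of CADt-net):

```python
# BI-RADS© assessment categories, transcribed from the table above.
BI_RADS_CATEGORIES = {
    "0": "Incomplete",
    "1": "Negative",
    "2": "Benign",
    "3": "Probably benign",
    "4": "Suspicious",
    "5": "Highly suggestive of malignancy",
    "6": "Known biopsy-proven malignancy",
}

def needs_biopsy(score: str) -> bool:
    """Categories 4 and 5 warrant biopsy, per the ACR guidance cited below [3]."""
    return score in ("4", "5")
```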
An incomplete (BI-RADS 0) classification warrants either an effort to obtain prior imaging for comparison or a call-back of the patient for additional views and/or higher-quality films. A BI-RADS classification of 4 or 5 warrants biopsy to further evaluate the offending lesion. [3]
Some experts believe that the single BI-RADS 4 classification does not adequately communicate the risk of cancer to doctors and recommend a sub-classification scheme:[4]
BI-RADS© Score | Inference |
---|---|
4A | Low suspicion for malignancy, about 2% |
4B | Intermediate suspicion of malignancy, about 10% |
4C | Moderate concern, but not classic for malignancy, about 50% |
As of the BI-RADS 5th edition [5], the breast composition categories are:
Category | Description |
---|---|
a. | The breasts are almost entirely fatty |
b. | There are scattered areas of fibro-glandular density |
c. | The breasts are heterogeneously dense, which may obscure small masses |
d. | The breasts are extremely dense, which lowers the sensitivity of mammography |
- Status Code
- Status
- Output
- Color
- Kappa Score
- Cohen's Kappa (Similar to Kappa Score)
- Fleiss' Kappa
- Positive Predictive Value
- Jaccard Coefficient
- Probability Score
- Sensitivity
- Cut-off Values
Evaluation metrics for v1.0
Parameters | Values |
---|---|
Cohen's Kappa Score | 0.89 |
Jaccard Coefficient | 0.90 |
Specificity | 0.958295 |
Sensitivity | 0.920495 |
The confusion matrix lets us quantify the metrics used to evaluate a classification model; a sketch of how these metrics are derived follows the matrix below. For medical use cases we require the number of FN (false negative) predictions to be minimal: in the classic pregnancy-test illustration, a false negative can have adverse effects on the individual's situation.
In cancer cases this cannot be taken lightly, so FN cases must be minimized; even state-of-the-art tools cannot provide diagnostics, but they can provide triage.
BI-RADS | 0 | 1 | 2 | 3 | 4 | 4a | 4b | 4c | 5 | 6 |
---|---|---|---|---|---|---|---|---|---|---|
0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
1 | 0 | 7 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 0 | 0 | 325 | 5 | 6 | 0 | 2 | 4 | 10 | 0 |
3 | 0 | 0 | 10 | 170 | 7 | 1 | 2 | 4 | 8 | 0 |
4 | 0 | 0 | 7 | 5 | 230 | 0 | 0 | 10 | 9 | 0 |
4a | 0 | 0 | 1 | 1 | 4 | 56 | 0 | 3 | 2 | 0 |
4b | 0 | 0 | 1 | 0 | 1 | 0 | 64 | 2 | 3 | 0 |
4c | 0 | 0 | 3 | 0 | 8 | 0 | 0 | 163 | 14 | 0 |
5 | 0 | 0 | 7 | 5 | 2 | 1 | 2 | 13 | 423 | 0 |
6 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 2 | 21 |
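The headline metrics in the table above (sensitivity, specificity, Jaccard coefficient, Cohen's kappa) can all be derived from a confusion matrix once it is collapsed to benign vs. malignant. A minimal sketch with made-up counts (this is not the CADt-net evaluation code, and the numbers are not taken from the matrix above):

```python
import numpy as np

# Illustrative binary confusion matrix (rows: actual, cols: predicted).
# The counts are assumptions for the sketch, not the CADt-net results.
#                pred benign  pred malignant
cm = np.array([[900,  40],    # actual benign
               [ 55, 700]])   # actual malignant

tn, fp = cm[0]                     # benign row: true negatives, false positives
fn, tp = cm[1]                     # malignant row: false negatives, true positives
n = cm.sum()

sensitivity = tp / (tp + fn)       # true positive rate: catching malignant cases
specificity = tn / (tn + fp)       # true negative rate
jaccard     = tp / (tp + fp + fn)  # Jaccard coefficient on the positive class

p_o   = (tp + tn) / n              # observed agreement
p_e   = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2  # chance agreement
kappa = (p_o - p_e) / (1 - p_e)    # Cohen's kappa

print(f"sensitivity={sensitivity:.4f} specificity={specificity:.4f}")
print(f"jaccard={jaccard:.4f} kappa={kappa:.4f}")
```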
BI-RADS | Benign | Malignant |
---|---|---|
Benign | | |
Malignant | | |
Methods | Device | Skillset | Cost per Scan | Device Cost | Precision | Accuracy |
---|---|---|---|---|---|---|
Manual scanning of the affected region | GE Vscan [6] | Semi-skilled | $15 | $7,900 | NA | NA |
Manual screening of the affected region | Philips Lumify [7] | Semi-skilled | $15 | $7,000 | NA | NA |
The workflow for the whole project and its internal processing pipeline is described below.
Labor | Cost |
---|---|
Unskilled Labor | Competitive |
Semi-skilled Labor | Competitive |
Skilled Labor | Competitive |
This section discusses various neural network architectures.
Type of Network | Detail of Network | Pros | Cons |
---|---|---|---|
Deep Neural Network | Has more than two layers, which allows it to model complex, non-linear relationships. Used for classification as well as regression. | Widely used, with great accuracy | The model can be trained only if sufficient computing power is available |
Convolutional Neural Network | Works on two-dimensional data, mainly image data. | The industry standard for images | Needs labelled data for classification |
Faster R-CNN | | | |
YOLO | | | |
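To make the CNN row above concrete, here is a minimal convolutional classifier in Keras (the framework pinned under Dependencies). This is an illustrative sketch only, not the CADt-net architecture; the grayscale 224×224 input and the 10-way output (one unit per BI-RADS category used in the confusion matrix above) are assumptions:

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Toy convolutional classifier: two conv/pool stages, then a dense head.
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax'),   # one probability per BI-RADS category
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```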
Model | Size | Top-1 Accuracy | Top-5 Accuracy | Parameters | Depth | Pros | Cons |
---|---|---|---|---|---|---|---|
VGG16 | 528 MB | 0.713 | 0.901 | 138,357,544 | 23 | Good accuracy | Huge |
VGG19 | 549 MB | 0.713 | 0.900 | 143,667,240 | 26 | - | - |
ResNet50 | 98 MB | 0.749 | 0.921 | 25,636,712 | - | Good documentation availability and a large number of parameters | Heavy for mobile applications |
InceptionV3 | 92 MB | 0.779 | 0.937 | 23,851,784 | 159 | - | - |
MobileNet V1 | 16 MB | 0.704 | 0.895 | 4,253,864 | 88 | Small size | Lower accuracy and fewer parameters |
DenseNet121 | 33 MB | 0.750 | 0.923 | 8,062,504 | 121 | - | - |
U-Net [15][16] | 188 MB | 0.82 | 0.932 | - | - | Very strong ROI identification | PyTorch implementation, which inhibits deployment on Android; requires manual tagging of points inside the ROI |
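CADt-net is described below as being based on ResNet50, so a hedged sketch of the usual transfer-learning pattern with that backbone from keras.applications [13] may help. Everything past the pretrained base (the pooling head, the 10-way BI-RADS output, the frozen layers) is an assumption, not the proprietary training code:

```python
from keras.applications.resnet50 import ResNet50
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

# ImageNet-pretrained backbone without its original 1000-class top layer.
base = ResNet50(weights='imagenet', include_top=False,
                input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False    # freeze the pretrained features at first

# New classification head; the 10-way BI-RADS output is an assumption.
x = GlobalAveragePooling2D()(base.output)
outputs = Dense(10, activation='softmax')(x)

model = Model(inputs=base.input, outputs=outputs)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```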
One of the biggest issues is the so-called black-box problem: although the math used to construct a neural network is straightforward, how a given output was arrived at is exceedingly complicated. Machine learning algorithms take a bunch of data as input, identify patterns, and build a predictive model, but understanding how the model works is the hard part. Deep learning models are often uninterpretable, and most researchers use them without knowing why they provide better results. [11]
Listed below are results from the trained model CADt-net, which is based on the ResNet50 architecture, as mentioned above in Documentation for individual models.
(Sample CADt-net outputs on Philips and GE Healthcare devices; images not available.)
Based on the study [9] performed in Guadalajara, the test ultrasound images below were taken from GE Vscan devices.
Image 1 | Image 2 | Image 3 |
---|---|---|
(image not available) | (image not available) | (image not available) |
Confidence: 99.99803305 | Confidence: 99.99915361 | Confidence: 99.99226332 |
Based on a Forbes article [8] by Robert Pearl, M.D. [10]:
By contrast, “Machine Learning” relies on neural networks (a computer system modeled on the human brain). Such applications involve multilevel probabilistic analysis, allowing computers to simulate and even expand on the way the human mind processes data. As a result, not even the programmers can be sure how their computer programs will derive solutions.
A pair of independent studies found that 50% to 63% of U.S. women who get regular mammograms over 10 years will receive at least one “false-positive” (a test result that wrongly indicates the possibility of cancer, thus requiring additional testing and, sometimes, unnecessary procedures). As much as one-third of the time, two or more radiologists looking at the same mammography will disagree on their interpretation of the results.
This suggests that the BI-RADS© labels in the training data provided to the model, even though they come from radiologists, may not be true. Hence, this alters the chances of the model actually learning the ground truth.
Visual pattern recognition software, which can store and compare tens of thousands of images while using the same heuristic techniques as humans, is estimated to be 5% to 10% more accurate than the average physician.
On top of these findings, we also learn that the model can be at most about 10% more accurate than skilled radiologists.
P(correct BI-RADS© score | radiologist) = P(A ∩ B) / P(B), where P(A ∩ B) = 0.333 × 0.11 and P(B) = 0.33, giving P(A | B) ≈ 0.11.
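Written out as a single display (a reconstruction from the values above; the text leaves the events A and B implicit, so the labels are assumptions):

```latex
P(A \mid B) \;=\; \frac{P(A \cap B)}{P(B)}
            \;=\; \frac{0.333 \times 0.11}{0.33}
            \;\approx\; 0.11
```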
[1] https://en.wikipedia.org/wiki/BI-RADS
[2] https://en.wikipedia.org/wiki/American_College_of_Radiology
[3] ACR Practice Guideline for the Performance of Ultrasound-Guided Percutaneous Breast Interventional Procedures Res. 29; American College of Radiology; 2009
[4] Sanders, M. A.; Roland, L.; Sahoo, S. (2010). "Clinical Implications of Subcategorizing BI-RADS 4 Breast Lesions associated with Microcalcification: A Radiology–Pathology Correlation Study". The Breast Journal. 16 (1): 28–31. DOI:10.1111/j.1524-4741.2009.00863.x PMID 19929890
[5] D'Orsi CJ, Sickles EA, Mendelson EB, Morris EA, et al. (2013). ACR BI-RADS® Atlas, Breast Imaging Reporting and Data System. Reston, VA: American College of Radiology.
[6] GE Vscan. https://www.gehealthcare.com/products/ultrasound/vscan-family/vscan-with-dual-probe
[7] Philips Lumify. https://www.usa.philips.com/healthcare/sites/lumify
[8] Forbes Article. https://www.forbes.com/sites/robertpearl/2018/03/13/artificial-intelligence-in-healthcare/#77e3e8c71d75
[9] Susan M. Love, Wendie A. Berg, Christine Podilchuk, Ana Lilia López Aldrete, Aarón Patricio Gaxiola Mascareño, Krishnamohan Pathicherikollamparambil, Ananth Sankarasubramanian, Leah Eshraghi, and Richard Mammone. Palpable Breast Lump Triage by Minimally Trained Operators in Mexico Using Computer-Assisted Diagnosis and Low-Cost Ultrasound. https://ascopubs.org/doi/full/10.1200/JGO.17.00222
[10] Robert Pearl M.D. https://www.gsb.stanford.edu/faculty-research/faculty/robert-m-pearl
[11] Muhammad Imran Razzak, Saeeda Naz and Ahmad Zaib. Deep Learning for Medical Image Processing: Overview, Challenges and Future. https://arxiv.org/ftp/arxiv/papers/1704/1704.06825.pdf
[12] ImageNet. http://www.image-net.org/
[13] Keras Applications. https://keras.io/applications/
[14] Towards trustable machine learning. https://www.nature.com/articles/s41551-018-0315-x
[15] U-Net: Convolutional Networks for Biomedical Image Segmentation. https://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/
[16] U-Net: Convolutional Networks for Biomedical Image Segmentation. https://arxiv.org/abs/1505.04597
[17] Attention Gated Networks. https://github.com/ozan-oktay/Attention-Gated-Networks/blob/master/train_classifaction.py
[18] Radiologist Level Accuracy Statistics. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3691059/
[19] Britton P, Warwick J, Wallis MG, et al. Measuring the accuracy of diagnostic imaging in symptomatic breast patients: team and individual performance. Br J Radiol. 2012;85(1012):415–422. doi:10.1259/bjr/32906819. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3486650/
Dataset Description
- mavd_1: "Dr. Berg Images", provided by Dr. Wendie Berg. MamaNetv1.0 is trained on this data.
- mavd_2 "ACRIN6666", provided by ACRIN: American College of Radiology Imaging Network which is a collation of data from multiple devices(including devices made by GE(General Electric), Philips, etc. )
- mavd_3 "NIH_Images", which is a collection of Ultrasound Images provided by NIH(National Institute of Health)
Symbol Name | Dataset | Proprietary | Sponsor/ Provider | Devices | Average Resolution | Number of Images | Size |
---|---|---|---|---|---|---|---|
mavd_1 | Dr. Berg Images | Yes | Dr. Wendie Berg | Philips | Unknown | 2267 | 1.35 GB |
mavd_2 | ACRIN6666 | Yes | American College of Radiology Imaging Network | Philips, GE | 1024 × 768 | 25409 | 5.35 GB |
mavd_3 | NIH_Images | Yes | National Institutes of Health | Philips | Unknown | 1227 | 450.07 MB |
mavd_4 | Mexico_Guadalraja | Yes | Dr. Susan Love Foundation | GE VScan | Variable | 719 | 206.5 MB |
Dependencies
Libraries | Version | Link |
---|---|---|
Shapely | 1.6.4 | https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely |
Tensorflow | 1.13.1 | |
Scipy | 1.2.1 | |
Keras | 2.1.6 | |
Pandas | 0.24.2 | |
Pillow | 6.0.0 | |
OpenCV | 4.1.0.25 | |
Markdown | 3.1.1 |
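As a hedged sketch of how these dependencies fit together at inference time (the file path, input size, and preprocessing are illustrative assumptions, not the mobile app's actual pipeline):

```python
import cv2
import numpy as np

# Read an ultrasound frame and shape it for a Keras classifier like the
# sketches above; 'example_ultrasound.png' is a placeholder path.
img = cv2.imread('example_ultrasound.png')             # BGR, uint8
img = cv2.resize(img, (224, 224)).astype('float32') / 255.0
batch = np.expand_dims(img, axis=0)                    # (1, 224, 224, 3)

# probs = model.predict(batch)   # `model` from the ResNet50 sketch above
# print(probs.argmax())          # index of the predicted BI-RADS category
```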